Bug 996038 - it takes 8~30 minutes or more to resume rhel guest from S4
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Marcel Apfelbaum
QA Contact: Virtualization Bugs
Keywords: Regression
Duplicates: 1135383
Depends On:
Blocks: 1056252 912287
Reported: 2013-08-12 05:29 EDT by Chao Yang
Modified: 2014-08-31 21:43 EDT
CC List: 15 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-03-23 08:50:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
dmesg from serial (50.92 KB, text/x-log)
2013-08-12 05:29 EDT, Chao Yang

Description Chao Yang 2013-08-12 05:29:14 EDT
Description of problem:
Booted a rhel6.4 guest with newer kernel installed, suspended to disk, then tried to resume it, but it took a long time to get resumed. 

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.382.el6.x86_64
seabios-0.6.1.2-28.el6.x86_64
2.6.32-407.el6.x86_64(both guest and host)


How reproducible:
100%

Steps to Reproduce:
1. boot a rhel guest by:
 /usr/libexec/qemu-kvm -name test -M rhel6.5.0 -cpu host -enable-kvm -m 4096 \
   -smp 4,sockets=4,cores=2,threads=1,maxcpus=8 \
   -rtc base=utc,clock=host,driftfix=slew \
   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
   -device ich9-usb-ehci1,id=ehci,addr=3.0 \
   -drive file=/home/usb-storage.qcow2,if=none,id=usb-storage,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
   -device usb-storage,drive=usb-storage,id=usb_storage_1 \
   -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=4.0 \
   -chardev socket,id=channel0,path=/tmp/socket,server,nowait \
   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=channel0,name=port0 \
   -device virtio-scsi-pci,id=scsi,addr=5.0 \
   -drive file=/home/scsi-storage.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
   -device scsi-disk,bus=scsi.0,drive=drive-virtio-disk0,id=virtio-disk0,lun=0 \
   -drive file=/home/ide.qcow2,if=none,id=ide,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
   -device ide-drive,drive=ide,bus=ide.1,unit=1 \
   -drive file=/home/device_interrupt.raw,if=none,id=drive-virtio-0-0,media=disk,format=raw,cache=none \
   -device virtio-blk-pci,drive=drive-virtio-0-0,id=virt0-0-0,bootindex=1,addr=6.0 \
   -netdev tap,id=hostnet1 \
   -device e1000,netdev=hostnet1,id=net1,mac=00:1a:4a:42:48:12,bus=pci.0,addr=a.0 \
   -netdev tap,id=hostnet0,vhost=on \
   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:48:ab,bus=pci.0,addr=7.0 \
   -spice port=5900,disable-ticketing,seamless-migration=on \
   -k en-us -vga cirrus -vnc :1 \
   -device intel-hda,id=sound0,bus=pci.0,addr=8.0 \
   -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=9.0 \
   -monitor stdio -serial unix:/tmp/serial,server,nowait \
   -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

2. suspend to disk by:
 echo disk > /sys/power/state

3. resume it with exactly the same CLI as in step 1
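
Step 2 can also be scripted inside the guest. The following is only a sketch, assuming a Python interpreter is available in the guest and the script runs as root; the function names are illustrative and not part of any existing tool. It checks that the kernel lists `disk` in `/sys/power/state` before writing to it, which helps rule out a guest where S4 is not offered at all (for example when `PIIX4_PM.disable_s4` is left at its default).

```python
def supported_sleep_states(contents):
    """Parse the text read from /sys/power/state into a set of state names."""
    return set(contents.split())


def suspend_to_disk(state_path="/sys/power/state"):
    """Equivalent of `echo disk > /sys/power/state`, with a support check first.

    Must run as root inside the guest. The write normally returns only
    after the guest has been resumed again.
    """
    with open(state_path) as f:
        states = supported_sleep_states(f.read())
    if "disk" not in states:
        raise RuntimeError(
            "'disk' is not offered by this kernel; available: {0}".format(sorted(states))
        )
    with open(state_path, "w") as f:
        f.write("disk")
```

Calling `suspend_to_disk()` with no arguments performs the suspend; `supported_sleep_states()` can be used on its own to inspect what the guest offers.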

Actual results:
It took more than 8 minutes. 

Expected results:


Additional info:
1. The guest appeared to hang at "Suspending console(s) (use no_console_suspend to debug)" for a long time.
2. The guest could catch up to real time after resuming from S4.
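
Since the stall is visible on the serial console, one way to see where the time goes is to timestamp each line arriving on the serial socket (`/tmp/serial` in the command line of step 1). A minimal Python sketch, meant to run on the host; the helper names are illustrative, and it assumes nothing else is connected to that socket while it runs.

```python
import socket
import time


def read_timestamped_lines(path, idle_timeout=3600.0):
    """Yield (seconds_since_start, line) for each line read from a Unix socket.

    Stops when the peer closes the connection or when no data arrives
    for idle_timeout seconds.
    """
    start = time.monotonic()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(idle_timeout)
        sock.connect(path)
        buf = b""
        while True:
            try:
                chunk = sock.recv(4096)
            except socket.timeout:
                break
            if not chunk:
                break
            buf += chunk
            while b"\n" in buf:
                raw, buf = buf.split(b"\n", 1)
                line = raw.decode("utf-8", "replace").rstrip("\r")
                yield time.monotonic() - start, line


def longest_gap(entries):
    """Return (gap_seconds, line_before_gap) for the longest silence between lines."""
    worst = (0.0, None)
    prev_t, prev_line = None, None
    for t, line in entries:
        if prev_t is not None and t - prev_t > worst[0]:
            worst = (t - prev_t, prev_line)
        prev_t, prev_line = t, line
    return worst


def report(path="/tmp/serial"):
    """Print the longest silence seen on the serial socket at `path`."""
    gap, line = longest_gap(read_timestamped_lines(path))
    print("longest silence: {0:.1f}s after {1!r}".format(gap, line))
```

Starting `report()` right after launching qemu-kvm for the resume would show which console message precedes the longest stall and how long it lasts.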
Comment 1 Chao Yang 2013-08-12 05:29:56 EDT
Created attachment 785624
dmesg from serial
Comment 3 Chao Yang 2013-08-12 06:01:08 EDT
I also tested S3 with the same CLI; the guest resumed quickly instead of taking many minutes.
Comment 4 Ademar Reis 2013-08-12 18:37:12 EDT
(In reply to chayang from comment #0)
> Description of problem:
> Booted a rhel6.4 guest with newer kernel installed, suspended to disk, then
> tried to resume it, but it took a long time to get resumed. 
> 

Looks like a regression then, can you confirm? Even though we don't support S3/S4, it would be good to keep this use-case working, especially considering it's a RHEL6/RHEL6 setup.
Comment 5 Chao Yang 2013-08-12 23:14:20 EDT
(In reply to Ademar de Souza Reis Jr. from comment #4)
> (In reply to chayang from comment #0)
> > Description of problem:
> > Booted a rhel6.4 guest with newer kernel installed, suspended to disk, then
> > tried to resume it, but it took a long time to get resumed. 
> > 
> 
> Looks like a regression then, can you confirm? Even though we don't support
> S3/S4, it would be good to keep this use-case working, specially considering
> it's RHEL6/RHEL6 setup.

Yes, this is a regression bug. 
Tried with qemu-kvm-0.12.1.2-2.355.el6.x86_64 using exactly the same CLI as in Comment 0. The guest took less than 20 seconds to resume from S4. Adding 'Regression' keyword.
Comment 7 Amit Shah 2013-08-13 07:09:29 EDT
Please narrow down the builds which introduce the regression.  Right now we know -355 is the good build and -382 is the bad one.
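
One systematic way to narrow the range is a bisection over the build numbers. The following is only a sketch: `is_bad` is a hypothetical callback standing in for "install that qemu-kvm build, run one S4 cycle, and report whether the resume was slow". It assumes every build between the two endpoints is available and that the slowdown persists in all builds after the one that introduced it.

```python
def first_bad_build(good, bad, is_bad):
    """Binary-search the build numbers in (good, bad] for the first bad one.

    good   -- a build number known to resume quickly (e.g. 355)
    bad    -- a build number known to resume slowly (e.g. 382)
    is_bad -- hypothetical callback: takes a build number, returns True
              if resume from S4 is slow on that build
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad
```

Between -355 and -382 this needs about five test runs rather than one per build. If resume times vary between runs of the same build, `is_bad` would need to repeat the measurement before deciding.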
Comment 8 Chao Yang 2013-08-14 10:51:01 EDT
(In reply to Amit Shah from comment #7)
> Please narrow down the builds which introduce the regression.  Right now we
> know -355 is the good build and -382 is the bad one.

I retested with -382 and -381, this time it just took about 1 minute to resume. But with -377, it only took about 5 seconds to resume.
Comment 9 Qunfang Zhang 2013-08-14 22:50:10 EDT
Some of our guys say it takes more than half an hour to resume a guest from S4. So our cases including S4 steps will be blocked.
Comment 10 langfang 2013-08-14 23:11:51 EDT
Hit this problem on the latest version:
Host:
# uname -r
2.6.32-412.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.387.el6.x86_64

Guest:
2.6.32-412.el6.x86_64

Steps:
1. Boot guest:
 /usr/libexec/qemu-kvm -M rhel6.5.0 -m 4G -smp 2 \
   -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 \
   -drive file=/mnt/rhel6.5-newinstall.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native \
   -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=flang,bootindex=1 \
   -spice port=5840,disable-ticketing -vga qxl \
   -qmp tcp:0:5556,server,nowait \
   -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
   -serial unix:/tmp/tty0,server,nowait \
   -boot menu=on -monitor stdio \
   -device virtio-balloon-pci,bus=pci.0,id=balloon0 \
   -netdev tap,vhost=on,id=hostnet0,script=/etc/qemu-ifup \
   -device virtio-net-pci,netdev=hostnet0,mac=00:10:20:2c:45:23,bus=pci.0,addr=0x4,id=net0 \
   -drive file=/home/RHEL6.4-20130123.0-Server-x86_64-DVD1.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
   -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x7 \
   -chardev socket,id=channel0,path=/tmp/serial,server,nowait \
   -device virtserialport,chardev=channel0,name=org.linux-kvm.port.0,bus=virtio-serial0.0,id=port1

2. Do S4

Results:

Do S4 (wait about 4 min, then qemu quits) --> boot guest with the same CLI --> wait about 20 min, still cannot resume
Comment 13 Amit Shah 2013-08-21 04:31:01 EDT
(In reply to chayang from comment #8)
> (In reply to Amit Shah from comment #7)
> > Please narrow down the builds which introduce the regression.  Right now we
> > know -355 is the good build and -382 is the bad one.
> 
> I retested with -382 and -381, this time it just took about 1 minute to
> resume. But with -377, it only took about 5 seconds to resume.

The diff between -377 and -381 doesn't show anything that touches ACPI or anything else that should affect S4.
Comment 18 langfang 2013-08-27 08:13:36 EDT
Tested this on the latest version:

Host:
# uname -r 
2.6.32-414.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.398.el6.x86_64
# rpm -q seabios
seabios-0.6.1.2-28.el6.x86_64


Guest: Windows


Results:
win8(32) --> S4 --> resume --> succeeded
win8(64) --> S4 --> resume --> succeeded
win7(32) --> S4 --> resume --> succeeded
win7(64) --> S4 --> resume --> BSOD

Bug 1001616 - win7-64 guest bsod while enter s4 state
Comment 22 Qunfang Zhang 2014-08-31 21:43:38 EDT
*** Bug 1135383 has been marked as a duplicate of this bug. ***
