Bug 647115 - guest cannot resume from S4
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.6
Hardware: x86_64 Linux
Priority: low, Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Zachary Amsden
QA Contact: Virtualization Bugs
Whiteboard: general operation
Keywords: Triaged
Duplicates: 701606 707839
Depends On:
Blocks: Rhel5KvmTier3 580954 716706
Reported: 2010-10-27 05:26 EDT by Chao Yang
Modified: 2013-01-09 18:17 EST

Doc Type: Bug Fix
Clones: 716706
Last Closed: 2011-06-26 11:01:55 EDT
Attachments
when resume from S4 (6.27 KB, image/png), 2010-10-28 04:25 EDT, Chao Yang
kernel console output (228.26 KB, image/png), 2010-10-28 05:18 EDT, Chao Yang
kernel console output (133.61 KB, image/png), 2010-10-28 05:43 EDT, Chao Yang
messages when resume from s4 (10.67 KB, application/octet-stream), 2011-01-27 07:38 EST, Chao Yang
s4 fails with -smp 1 (24.34 KB, text/plain), 2011-05-13 09:39 EDT, Chao Yang
launch guest with one cpu, (22.27 KB, text/plain), 2011-05-13 09:42 EDT, Chao Yang

Description Chao Yang 2010-10-27 05:26:36 EDT
Description of problem:


Version-Release number of selected component (if applicable):

--Host

#rpm -qa | grep kvm
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-204.el5
kmod-kvm-83-204.el5
kvm-tools-83-204.el5
kvm-83-205.el5
etherboot-roms-kvm-5.4.4-13.el5
kvm-debuginfo-83-204.el5

#uname -r
2.6.18-227.el5

#rpm -qa | grep spic
qspice-0.3.0-56.el5
qspice-debuginfo-0.3.0-56.el5
qspice-libs-0.3.0-56.el5

--Guest
#uname -r
2.6.18-194.el5


How reproducible:


Steps to Reproduce:
1. Boot the guest with spice and qxl
#/usr/libexec/qemu-kvm -M rhel5.6.0 -m 2G -smp 2 -drive
file=/root/chayang/testcasefor5u6.raw,if=ide,format=raw,boot=on,cache=none,werror=stop
-net nic,vlan=0,macaddr=24:23:12:25:b1:5a,model=e1000 -net
tap,vlan=0,script=/etc/qemu-ifup -boot c -monitor stdio -spice
host=0,ic=on,port=5930,disable-ticketing -qxl 1

2. Connect to the guest via the spice client
# spice -h 10.66.91.43 -p 5930
3. Do S4 in the guest
echo disk > /sys/power/state
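
Before step 3 it may be worth confirming that the guest kernel advertises hibernation at all. The following is only a minimal sketch: the helper name `supports_s4` is hypothetical, and it assumes the contents of /sys/power/state are passed in as a single space-separated string.

```shell
# Hypothetical helper: succeed if the given /sys/power/state contents
# list "disk" (S4 / hibernate) as a supported sleep state.
supports_s4() {
    case " $1 " in
        *" disk "*) return 0 ;;
        *)          return 1 ;;
    esac
}
```

Inside the guest this could be used as `supports_s4 "$(cat /sys/power/state)" && echo disk > /sys/power/state`.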

Actual results:
The guest cannot resume successfully; see the attachment for details.

Expected results:

The guest resumes successfully.

Additional info:
Comment 1 Gleb Natapov 2010-10-27 15:10:48 EDT
Retry without spice.
Comment 2 Chao Yang 2010-10-27 21:48:07 EDT
Hit the same issue with vnc
Comment 3 Gleb Natapov 2010-10-28 03:00:17 EDT
(In reply to comment #2)
> Hit the same issue with vnc
There is no attachment to look at. Try using another nic model.
Comment 4 Chao Yang 2010-10-28 04:25:04 EDT
Created attachment 456180 [details]
when resume from S4
Comment 5 Gleb Natapov 2010-10-28 04:32:06 EDT
(In reply to comment #4)
> Created attachment 456180 [details]
> when resume from S4

Are you doing resume from console or from X? If from X try doing it from console. Also redirect kernel console output to ttyS0 and capture it on the host. Can you ssh into the guest after resume?
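
One way to follow the ttyS0 suggestion is to add serial-console arguments to the guest's kernel line. The sketch below is an illustration only: the helper name `with_serial_console`, the 115200 baud rate, and the sample kernel line in the usage note are assumptions, not values taken from this report.

```shell
# Hypothetical helper: append serial-console arguments to a guest kernel
# command line unless a ttyS0 console is already configured.
with_serial_console() {
    case " $1 " in
        *" console=ttyS0"*) printf '%s\n' "$1" ;;
        *) printf '%s console=tty0 console=ttyS0,115200\n' "$1" ;;
    esac
}
```

For example, `with_serial_console "ro root=/dev/VolGroup00/LogVol00"` prints the same line with `console=tty0 console=ttyS0,115200` appended. On the host side, later command lines in this report expose the guest serial port with `-serial unix:/tmp/test.sock,server,nowait`, and the console output can be captured from that socket.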
Comment 6 Chao Yang 2010-10-28 05:18:27 EDT
Created attachment 456193 [details]
kernel console output

1. Are you doing resume from console or from X?
  I was resuming from X. I hit the issue again from the console.
2. Redirect kernel console output to ttyS0 and capture it
  Please look at the attachment. Resuming from the console gives the same kernel console output as resuming from X.
3. Can you ssh into the guest after resume?
  No, ssh into the guest fails after resume.
Comment 7 Gleb Natapov 2010-10-28 05:26:13 EDT
(In reply to comment #6)
> Created attachment 456193 [details]
> kernel console output
> 
> 1.Are you doing resume from console or from X?
>   Doing resume from X.Hit again from console.
> 2.redirect kernel console output to ttyS0 and capture it
>   Please look at attachment.Resume from console is the same kernel console
From the attachment I see you are using virtio net. This is not supposed to work.
Use something else.
Comment 8 Chao Yang 2010-10-28 05:41:57 EDT
Comment on attachment 456193 [details]
kernel console output

I am sorry for that mistake.
I filed this bug using the e1000 NIC, then tried the rtl8139 NIC, and still hit this issue.
Comment 9 Chao Yang 2010-10-28 05:43:29 EDT
Created attachment 456202 [details]
kernel console output
Comment 10 Gleb Natapov 2010-10-28 05:47:40 EDT
Try an older guest. S4 resume problems are guest bugs in most cases. For Linux guests, I don't remember it ever being a kvm bug.
Comment 11 Chao Yang 2010-10-29 01:26:01 EDT
(In reply to comment #10)
> Try older guest. S4 resume problems in most cases are guest bugs. In case of
> Linux guest I don't remember it ever was kvm bug.

1.
CLI:/usr/libexec/qemu-kvm -M rhel5.6.0 -m 2G -smp 2 -drive file=/root/chayang//testcasefor5u6.raw,if=ide,format=raw,boot=on,cache=none,werror=stop -net nic,vlan=0,macaddr=24:23:12:25:b1:5a,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup -boot c -monitor stdio -vnc :18

I tried an older guest kernel on the RHEL 5.6 host; resume from S4 succeeded.
  guest kernel: 2.6.18-164.el5
  #dmesg|grep -i kvm
  I did not see " time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer." in dmesg, but I did see "kvm_get_tsc_khz:cpu 0,msr 0:2401001".

2.
CLI:/usr/libexec/qemu-kvm -M rhel6.0.0 -m 2G -smp 2 -drive file=/root/testcasefor5u6.raw,if=ide,format=raw,boot=on,cache=none,werror=stop -net nic,vlan=0,macaddr=24:23:12:25:b1:5a,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup -boot c -monitor stdio -vnc :19

I also tried guest kernel 2.6.18-194.el5 on a RHEL 6 host; it can resume from S4.
  Running #dmesg|grep -i kvm prints " time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer." in dmesg.
 
Booting guest kernel 2.6.18-164.el5 on the RHEL 6 host can resume from S4, too.
  Running #dmesg|grep -i kvm prints "kvm_get_tsc_khz:cpu 0,msr 0:238c001".


NOTE: I did both of these steps without adding no-kvmclock to the guest kernel parameters.
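
The dmesg checks above can be scripted. This is a sketch under a stated assumption: it treats the "KVM GTOD KVM timer" marker quoted in this comment as the sign that the guest is using KVM clock. The helper name `guest_clock_from_dmesg` and the "no-marker" label are hypothetical.

```shell
# Hypothetical helper: classify guest timekeeping from dmesg text.
# Assumption: the "KVM GTOD KVM timer" marker indicates KVM clock is in use.
guest_clock_from_dmesg() {
    case "$1" in
        *"KVM GTOD KVM timer"*) echo kvmclock ;;
        *)                      echo no-marker ;;
    esac
}
```

In the guest this could be called as `guest_clock_from_dmesg "$(dmesg | grep -i kvm)"`.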
Comment 12 Gleb Natapov 2010-10-31 03:19:09 EDT
Glauber, can you look at comment above please? Any ideas?
Comment 14 RHEL Product and Program Management 2011-01-11 15:44:22 EST
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.
Comment 15 RHEL Product and Program Management 2011-01-11 17:55:23 EST
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.
Comment 19 Chao Yang 2011-01-27 07:38:06 EST
Created attachment 475595 [details]
messages when resume from s4
Comment 20 Glauber Costa 2011-03-16 11:54:45 EDT
Please try booting your guest with clock=pmtmr. We need to rule out a clock issue here.
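
To confirm that a test run really booted with the requested parameter, the guest's kernel command line can be checked. This is a minimal sketch; the helper name `uses_pmtmr` is hypothetical and it expects the command line as a string.

```shell
# Hypothetical helper: succeed if a kernel command line contains clock=pmtmr.
uses_pmtmr() {
    case " $1 " in
        *" clock=pmtmr "*) return 0 ;;
        *)                 return 1 ;;
    esac
}
```

Inside the guest, `uses_pmtmr "$(cat /proc/cmdline)" && echo "booted with clock=pmtmr"` would report whether the parameter took effect. On a RHEL 5 guest the parameter would typically be appended to the kernel line in /boot/grub/grub.conf.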
Comment 21 Chao Yang 2011-03-17 08:07:55 EDT
(In reply to comment #20)
> Please try booting your guest with clock=pmtmr. We need to rule out a clock
> issue here.

Glauber,
 I have tested 14 times with clock=pmtmr, 7 with a desktop guest and 7 with a guest without a desktop; the issue disappears.

Host kernel: 2.6.18-238.el5
Guest kernel: 2.6.18-238.el5
kvm version:
# rpm -qa|grep kvm
kvm-tools-83-227.el5
kvm-83-227.el5
kmod-kvm-83-224.el5
kvm-debuginfo-83-227.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-227.el5
Comment 22 Gleb Natapov 2011-05-12 05:12:15 EDT
Works with pmtmr. Looks like kvmclock problem.
Comment 23 Zachary Amsden 2011-05-12 05:30:39 EDT
(In reply to comment #11)
> (In reply to comment #10)
> > Try older guest. S4 resume problems in most cases are guest bugs. In case of
> > Linux guest I don't remember it ever was kvm bug.
> 
> 1.
> CLI:/usr/libexec/qemu-kvm -M rhel5.6.0 -m 2G -smp 2 -drive
> file=/root/chayang//testcasefor5u6.raw,if=ide,format=raw,boot=on,cache=none,werror=stop
> -net nic,vlan=0,macaddr=24:23:12:25:b1:5a,model=e1000 -net
> tap,vlan=0,script=/etc/qemu-ifup -boot c -monitor stdio -vnc :18
> 
> I tried with older guest kernel on rhel5.6 host,resume from S4 successfully.
>   guest kernel:2.6.18-164.el5
>   #dmesg|grep -i kvm
>   I did not see " time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer." in dmesg
> but "kvm_get_tsc_khz:cpu 0,msr 0:2401001"

So it looks like we have a bug in RHEL 5.6 KVM clock, which may or may not be fixable, which has been corrected in RHEL 6.

The bug happens with kernels which use KVM clock.

We are either missing something from RHEL 5.6 host or 5.6 guest which is fixed in later kernels.  There is a possibility that we can't fix the RHEL 5.6 host kernel at all to work around whatever is causing this bug; the code change in RHEL 6 with improved timer infrastructure may not be possible to backport, and at this point, is probably too complex to manage in time for a RHEL 5 kernel release.

If we can't find out what is causing this soon, my recommendation is going to be disabling KVM clock in RHEL 5.6 kernels.

One important point of data - is the guest kernel 32-bit or 64-bit?
Second important point - does the bug reproduce with -smp 1 ?
Comment 24 Chao Yang 2011-05-13 09:36:33 EDT
(In reply to comment #23)

> 
> One important point of data - is the guest kernel 32-bit or 64-bit?
So far, I haven't reproduced this bug on 32-bit (tried RHEL5.6-32 and RHEL5.7-32 guests); it seems to happen only on 64-bit.
RHEL-Server-5.7-32.qcow2 RHEL-Server-5.6-32.qcow2

> Second important point - does the bug reproduce with -smp 1 ?

Yes, it reproduces with -smp 1 on an x86_64 guest; I will attach the log.
CLI: /usr/libexec/qemu-kvm -M rhel5.6.0 -no-hpet -rtc-td-hack -startdate now -name rhel5.7 -smp 1 -m 2048 -cpu qemu64,+sse2 -uuid `uuidgen` -boot c -net nic,vlan=1,macaddr=F0:4D:A2:24:ad:89,model=e1000 -net tap,vlan=1,script=/etc/qemu-ifup -drive file=/root/virtual-NIC/rhel5.7-64.qcow2,media=disk,if=ide,cache=none,boot=on,format=qcow2 -vnc :1 -notify all -balloon none -monitor stdio -serial unix:/tmp/test.sock,server,nowait
Comment 25 Chao Yang 2011-05-13 09:39:05 EDT
Created attachment 498765 [details]
s4 fails with -smp 1
Comment 26 Chao Yang 2011-05-13 09:42:49 EDT
Created attachment 498766 [details]
launch guest with one cpu,
Comment 27 Chao Yang 2011-05-13 09:45:40 EDT
(In reply to comment #25)
> Created attachment 498765 [details]
> s4 fails with -smp 1

Ignore Comment #25; attachment 498765 [details] was generated by a two-CPU guest.
Comment 28 Zachary Amsden 2011-05-13 13:22:33 EDT
(In reply to comment #24)
> (In reply to comment #23)
> 
> > 
> > One important point of data - is the guest kernel 32-bit or 64-bit?
> So far, I haven't reproduced this bug on 32-bit(tried on RHEL5.6-32 and
> RHEL5.7-32 guest), seems only happens on 64-bit.
> RHEL-Server-5.7-32.qcow2 RHEL-Server-5.6-32.qcow2

Sounds to me like we are missing a patch for 64-bit RHEL 5, which has already been applied on 32-bit.  Quite easy to do as the 32-bit and 64-bit kernels here are separate and I believe there were a bunch of kvmclock patches backported from upstream.
Comment 29 Glauber Costa 2011-05-17 13:30:36 EDT
*** Bug 701606 has been marked as a duplicate of this bug. ***
Comment 30 Zachary Amsden 2011-05-19 04:33:52 EDT
After investigating further, and discovering I was looking at the wrong tree, I found that this bug should already be fixed.

>-Guest
>#uname -r
>2.6.18-194.el5

This guest is too old to use kvmclock effectively on an SMP guest; it is missing the atomic backwards protection which was added later.

Can you retry with an updated el5 kernel?  I looked and the proper fixes are in the following:

2.6.18-238.9.1.el5

I'm still a bit confused where the 5.6 kernels on the install media come from, but if they are recent enough to be updated to -194, they should be recent enough to be updatable to -238 as well.
Comment 31 Zachary Amsden 2011-05-19 13:53:21 EDT
It really looks like this should be fixed, please verify with an updated guest kernel.
Comment 32 Chao Yang 2011-05-20 04:52:53 EDT
(In reply to comment #31)
> It really looks like this should be fixed, please verify with an updated guest
> kernel.

I tested twice. The first S4 worked fine, but the second time it got stuck at:
Trying to resume from /dev/VolGroup00/LogVol01
Resuming from /dev/VolGroup00/LogVol01.
Attempting manual resume
Disabling non-boot CPUs ...
CPU 1 is now offline
SMP alternatives: switching to UP code
CPU1 is down
Stopping tasks: ======|
Shrinking memory... done (0 pages freed)
Loading image data pages (68039 pages) ... done
Read 272156 kbytes in 5.24 seconds (51.93 MB/s)

And the CPU usage is:
PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11565 root      15   0 2262m 351m 3436 S 99.9  4.4   4:59.02 qemu-kvm     

host: # uname -a
Linux localhost.localdomain 2.6.18-238.12.1.el5 #1 SMP Sat May 7 20:18:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
guest: # uname -a
Linux localhost.localdomain 2.6.18-238.12.1.el5 #1 SMP Sat May 7 20:18:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

# dmesg|grep -i time.c
time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.
time.c: Detected 2666.754 MHz processor.
Real Time Clock Driver v1.12ac


CLI:
/usr/libexec/qemu-kvm -M rhel5.6.0 -no-hpet -rtc-td-hack -startdate now -name rhel5.6 -smp 2 -m 2048 -cpu qemu64,+sse2 -uuid `uuidgen` -boot c -net nic,vlan=1,macaddr=64:31:50:43:49:45,model=e1000 -net tap,vlan=1,script=/etc/qemu-ifup -drive file=RHEL-Server-5.6-64.qcow2,media=disk,if=ide,cache=none,boot=on,format=qcow2 -vnc :1 -notify all -balloon none -monitor stdio -serial unix:/tmp/chayang.unix,server,nowait
Comment 33 Chao Yang 2011-05-20 04:54:36 EDT
And the guest is not reachable over the network:
# ping 10.66.9.185
PING 10.66.9.185 (10.66.9.185) 56(84) bytes of data.
From 10.66.11.212 icmp_seq=2 Destination Host Unreachable
From 10.66.11.212 icmp_seq=3 Destination Host Unreachable
From 10.66.11.212 icmp_seq=4 Destination Host Unreachable

--- 10.66.9.185 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3999ms, pipe 3
Comment 34 Zachary Amsden 2011-05-20 13:26:33 EDT
It's not clear what the failure rate was before, but it appeared to be 100%. Now it is not, and a hang during S4 resume could be caused by any number of kernel changes.  Can we implicate or rule out the clocksource again with the new kernel by testing with clock=pmtmr?
Comment 35 Chao Yang 2011-05-23 01:56:22 EDT
(In reply to comment #34)
> It's not clear what the failure rate was before, but it appeared to be 100%,
> now it is not, and a hang during S4 resume could be caused by any number of
> kernel changes.  Can we implicate or rule out the clocksource again with the
> new kernel by testing with clock=pmtmr

I tested 30 times with kernel 2.6.18-238.12.1.el5 (x86_64).

Clock        Times             Result 
pmtmr        15                ALL PASS
kvmclock     15                6 FAIL, 9 PASS
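
For comparison across runs, the table can be reduced to a failure percentage. This is a small sketch in shell integer arithmetic; the helper name `fail_percent` is hypothetical.

```shell
# Hypothetical helper: integer failure percentage from a failure count
# and a total run count (integer division, so the result is truncated).
fail_percent() {
    echo $(( $1 * 100 / $2 ))
}
```

With the numbers above, `fail_percent 6 15` gives 40 for kvmclock and `fail_percent 0 15` gives 0 for pmtmr.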
Comment 36 Zachary Amsden 2011-05-23 12:20:58 EDT
It's not clear how to proceed on this bug at this point in time.  Yes, it's a real bug, and yes, it is fixed by moving to pmtmr.  It should be fixed when running on a RHEL 6.1 hypervisor, but earlier RHEV releases may cause problems.

Two problems conspire to cause the bug.  The first was the lack of backwards protection in the guest, which has already been fixed by updating the kernel.  The second is that the hypervisor is missing S4 suspend kvm clock compensation, which was added in RHEL 6.1 and was not present before.

It's not going to be possible to easily backport that code into RHEL5, certainly not in time for this release.  The S4 suspend compensation was an incremental improvement built on major infrastructure work on both kvm clock and guest timekeeping in general, and also depends on other pieces of infrastructure (high-res clocksource changes) which are extremely risky and complex to backport.

The way I see it, we have essentially 4 choices:

1) document the bug and known workaround - as most people running virtual machines are not going to be putting their systems into S4 sleep anyway, this may be an acceptable solution.

2) add patches to either disable or default kvmclock to off when running under a RHEL5 hypervisor.  Not sure if we publish a recognizable version field, so this may have the undesirable side effect of turning off KVM clock even when it is fully usable.

3) a complex and tedious backport of the whole kvmclock and clocksource improvements.  This is by far the riskiest option.

4) a selective backport of just the S4 clock compensation into RHEL5; it may be possible, but the code involved is subtle and hasn't been tested in that order of application.  Seeing as the S4 compensation isn't even upstream yet, this is also a risky choice for a RHEL release, especially a RHEL5 update.

Requesting additional opinions about which path to proceed down, but my vote is #1.
Comment 37 Glauber Costa 2011-05-23 14:38:03 EDT
I think #1 is better as well.
Comment 38 Xiaoqing Wei 2011-05-26 02:40:22 EDT
*** Bug 707839 has been marked as a duplicate of this bug. ***
Comment 39 Zachary Amsden 2011-06-08 12:53:30 EDT
Actually, delving further into this... I was under the mistaken impression that the guest failed when the host was put into S4 suspend.  That still won't be possible on a RHEL5 hypervisor, but it is possible on a RHEL6 hypervisor - however, this is a completely separate bug, and one that is probably a WONTFIX for RHEL5 and a feature improvement for RHEL6.  The solutions I proposed in Comment 36 were based on this misunderstanding.

However, what's going on in this bug is actually a GUEST S4 suspend / resume.  The fact that a kvmclock guest can't come back from this is certainly a guest bug, not a hypervisor issue, so it would require a RHEL5 guest kernel patch.  I don't believe S4 suspend was an original design parameter for KVM clock, and there are a number of things that could go wrong along the resume path.  It's not clear whether the bug is low probability, or possibly fixed on 32-bit and not 64-bit, but there are some other factors at work here causing it to work in some configurations and not in others.

Our recommendation is almost certainly going to be - don't do S4 suspend if you use KVM clock.  It's entirely unnecessary, as you can do loadvm / savevm, which provides nearly the same facility.

So our actual choices then are going to be:

1) disable S4 suspend when KVM clock is in use
2) disable kvmclock
3) diagnose and fix the problem (which is still an issue in 6.1 - see BZ 694801, comment 4 - and thus likely upstream as well), get the fix upstream, then backport the fix to all of the 5.6 and later releases.
4) document the problem and recommend not using kvmclock and S4 suspend in combination when running in a VM in the 5.6 release notes

Obviously #3 is the best choice, but as far as 5.6 goes, that ship has sailed, and there isn't sufficient time to do anything at all about it.  For now, if we do anything at all to change the guest kernel (#1, #2, or #3), it is going to take a while to get into the next update.

Given that state of affairs and the relative obscurity of the issue, I would propose not rushing this: take approaches #3 and #4 in parallel, fix it properly upstream, document it as a known issue, and backport changes only if they can be shown to be either low risk or highly demanded by users of 5.x kernels.

Glauber, do you remember any bugs with S4 resume of a kvmclock guest, or any patches that might have missed one of the 64-bit kernel paths and been fixed on 32-bit?  I'll go back and look over the code again now with a proper understanding of the bug.
Comment 40 Zachary Amsden 2011-06-21 22:15:46 EDT
I think the root of the problem here is that kvmclock doesn't support a clocksource resume method...
Comment 41 Dor Laor 2011-06-26 11:01:55 EDT
I have cloned this against rhel6 and I will close it for rhel5.
