| Summary: | F15/F16 guest VMs hang with 100% CPU when host resumes from long suspend | |||
|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Tomas Mraz <tmraz> | |
| Component: | kernel | Assignee: | Amit Shah <amit.shah> | |
| Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 16 | CC: | crobinso, gansalmon, ilari.stenroth, itamar, jonathan, kernel-maint, knoel, madhu.chinakonda, petersen | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 767498 947727 | Environment: | ||
| Last Closed: | 2012-09-05 13:40:30 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 767498 | |||
|
Description
Tomas Mraz
2011-06-03 09:08:31 UTC
Yes, rather annoying, and it has been happening since before F15 Beta, iirc. I don't think the host OS matters much: I have seen it on both F14 and F15 x86_64 hosts, but the problem only occurs for F15 guests, as Tomas also mentioned (both i686 and x86_64). I have tried leaving a "top -b" process running as you once suggested but haven't gotten any useful output from that yet.
Is there anything else we can try to get more info on this problem?
Increasing severity and priority in the hope this might get some attention.
Also happens with F16 rawhide guests.
Oh, my rawhide guest resumed now! It is running kernel-3.0-0.rc7.git3.1.fc16.x86_64.
It did not for me with kernel-3.0.0-1.fc16.x86_64. So no change here. Although the hang was not complete - ping works, even ssh can connect but not finish the login, gpm on the text console can move the mouse cursor but getty does not react to the keyboard.
Yeah, I think I was too optimistic too soon: after that I have still seen issues too.
Happened again for me today with a Fedora-16-Nightly-20111019.10-x86_64-Live-desktop guest.
I wonder if this is related to guest clock steal time accounting. Linux 3.0-rc6 introduced steal time accounting to KVM. But if this happens with kernels prior to 3.0 then it's probably not related to steal time accounting. I don't have a hibernateable system with KVM guests so I cannot try it out myself. Could you try booting the guest kernel with the parameter "no-steal-acc" and see if something changes?
No, I was seeing it with kernel-2.6.38 already. I'm unsure which version it started with, though. Also, I've always suspected some interaction of the kernel with systemd to be the cause, because in F14 we had upstart as init. I do not want to say that systemd is the culprit here; I'd rather say that systemd does something with the kernel that upstart did not do, and that this triggers the bug in the kernel. I'll try some distro with upstart init and a recent kernel.
Tried Linux Mint 12 with 3.0.0 kernel and upstart - does not hang. Currently updated F16 still hangs.
A possible culprit might be the use of cgroups. Systemd uses them, upstart does not.
Moving this to F16 for now.
Is this still being seen in current F16/F17/rawhide?
I did not see the hang recently. So if it happens at all, it must be much, much less often than before. But I've also changed the host machine hardware, so it might be related to this change.
Closing per comment #15. There were also recent changes to prevent libvirt from getting stuck on one CPU after resume. |