Bug 240413
Summary: | Guests get stuck in paused state when booting under heavy load
---|---
Product: | [Fedora] Fedora
Reporter: | Richard W.M. Jones <rjones>
Component: | xen
Assignee: | Richard W.M. Jones <rjones>
Status: | CLOSED WORKSFORME
Severity: | medium
Priority: | medium
Version: | 9
CC: | katzj, triage
Hardware: | x86_64
OS: | Linux
Whiteboard: | bzcl34nup
Doc Type: | Bug Fix
Last Closed: | 2008-09-09 13:07:52 UTC
Description
Richard W.M. Jones
2007-05-17 12:55:03 UTC
Created attachment 154911 [details]
xend.log
This is xend.log, cut down so it starts just before the guest is booted.
Domain of interest is ID 452, name fc6-3.
Created attachment 154912 [details]
xend-debug.log
This is xend-debug.log, cut down so it starts just before the guest is booted.
Domain of interest is ID 452, name fc6-3.
(A reminder to capture xenstore-ls output next time this happens)
Created attachment 154917 [details]
Output of xenstore-ls with 3 domains paused this way
Now I seem to have a reliable way to reproduce this bug.
What I do is take a huge file (a 4GB disk image from one of the guests) and
copy it. Three domains were cycling while this was happening, and all 3 are
now stuck paused.
# /usr/sbin/xm list
Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2984     4 r-----  23718.2
centos5          256     1             0.2
fc6              256     1            55.9
fc6-2       492  256     1 --p---      0.0
fc6-3       493  256     1 --p---      0.0
fc6-4       494  256     1 --p---      0.0
freebsd32        256     1             0.0
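The stuck guests in a listing like the one above can be picked out mechanically. The following is a minimal sketch and was not part of the original report. It assumes the `xm list` output has already been captured as text, and that the State column is a six-character flag string in which a `p` in the third position marks a paused domain. The function name `paused_domains` is hypothetical.

```python
import re

# Assumption: the State column is six flag characters (r, b, p, s, c, d),
# each replaced by "-" when the flag is not set.
STATE_RE = re.compile(r"^[r-][b-][p-][s-][c-][d-]$")


def paused_domains(xm_list_output):
    """Return the names of domains whose state flags include 'p' (paused)."""
    paused = []
    for line in xm_list_output.splitlines():
        fields = line.split()
        # Skip blank lines and the column header.
        if not fields or fields[0] == "Name":
            continue
        # Inactive domains have no ID or State column, so search the
        # remaining fields for a state string instead of indexing by position.
        state = next((f for f in fields[1:] if STATE_RE.match(f)), None)
        if state is not None and state[2] == "p":
            paused.append(fields[0])
    return paused


if __name__ == "__main__":
    sample = """\
Name ID Mem VCPUs State Time(s)
Domain-0 0 2984 4 r----- 23718.2
centos5 256 1 0.2
fc6 256 1 55.9
fc6-2 492 256 1 --p--- 0.0
fc6-3 493 256 1 --p--- 0.0
fc6-4 494 256 1 --p--- 0.0
freebsd32 256 1 0.0
"""
    print(paused_domains(sample))
```

Matching the state string by pattern rather than by column index is a deliberate choice here, because rows for domains that are not running omit both the ID and the State fields.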
There is a message produced when this happens; it comes from the xm start
command itself, and it confirms the theory that the hotplug scripts are timing
out:
+ /usr/sbin/xm start fc6-3
Error: Device 0 (vif) could not be connected. Hotplug scripts not working.
Usage: xm start <DomainName>
Start a Xend managed domain
-p, --paused Do not unpause domain after starting it
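A load-testing wrapper could watch for this message to tell hotplug failures apart from other start errors. The sketch below is an illustration only and was not part of the original report. It assumes the error text is worded exactly as quoted above, which may differ between xend versions, and the function name `hotplug_failure` is hypothetical.

```python
import re

# Assumption: the message matches the wording quoted in this report.
HOTPLUG_ERROR = re.compile(
    r"Error: Device (\d+) \((\w+)\) could not be connected\. "
    r"Hotplug scripts not working\."
)


def hotplug_failure(xm_output):
    """Return (device_number, device_type) if the output reports a hotplug
    failure, otherwise None."""
    match = HOTPLUG_ERROR.search(xm_output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    captured = (
        "Error: Device 0 (vif) could not be connected. "
        "Hotplug scripts not working.\n"
        "Usage: xm start <DomainName>\n"
    )
    print(hotplug_failure(captured))
```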
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.

Assigning it to me to retest.

Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

I retested with my load testing scripts a while back and didn't see anything like this, so I'm going to assume WORKSFORME for now.