Bug 754128
| Summary: | Shutting Down VM changes its state to "Pause" for 10sec | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Ortal <ogvura> | ||||
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> | ||||
| Status: | CLOSED ERRATA | QA Contact: | yeylon <yeylon> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 6.1 | CC: | acathrow, ajia, dallan, danken, hateya, jdenemar, msvoboda, mzhan, rwu, srevivo, ydu, zhpeng | ||||
| Target Milestone: | rc | Keywords: | ZStream | ||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libvirt-0.9.9-0rc1.el6 | Doc Type: | Bug Fix | ||||
| Doc Text: |
When shutting down, a virtual machine had changed its status from the "Up" state to the "Paused" state before it was shutdown. The "Paused" state represented the state when the guest had been already stopped, but QEMU was flushing its internal buffers and was waiting for libvirt to kill it. But this state change confused users so this update adds respective events and modifies libvirt to use the "shutdown" state. A virtual machine now moves from the "Up" to "Powering Down" and then to "Down" state.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2012-06-20 06:36:44 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 770959, 773650, 773651, 773677, 773696 | ||||||
| Attachments: |
|
||||||
Created attachment 533769 [details]
logs.tar
I don't think this is a bug. One needs to distinguish basically three states on guest shutdown: guest runing; guest OS halted but qemu preparing to be killed (e.g. flushing its internal caches); qemu halted;
When transitioning from the 1st do the 2nd state, libvirt emits an event {PAUSED, VIR_DOMAIN_PAUSED_SHUTTING_DOWN}. This in fact reflects reality where guest OS is not responding anymore, because we are performing shutdown. Please note, that this state can last very long, e.g. when qemu wants to flush data on stale NFS. We definitely can not show VM as running during this stage.
However, if there is wider consensus this is a bug, I think vdsm could change reporting of state to RHEV;
I can reproduce it with: kernel-2.6.32-221.el6.x86_64 libvirt-0.9.4-23.el6.x86_64 libvirt-client-0.9.4-23.el6.x86_64 libvirt-python-0.9.4-23.el6.x86_64 qemu-img-0.12.1.2-2.210.el6.x86_64 qemu-kvm-0.12.1.2-2.210.el6.x86_64 qemu-kvm-tools-0.12.1.2-2.210.el6.x86_64 gpxe-roms-qemu-0.9.7-6.9.el6.noarch vdsm-4.9-110.el6.x86_64 vdsm-cli-4.9-110.el6.x86_64 the same log result *** Bug 758648 has been marked as a duplicate of this bug. *** We agreed that we should actually use our (currently unused by qemu driver) "shutdown" state instead of "paused" and issue the event accordingly. With this change, states will transition as follows: running -> shutdown -> shutoff I'll send patches upstream soon. This is now fixed upstream by v0.9.8-rc2-5-g38527c9:
commit 38527c9ae0d398a996e1e3c0897a104e3ffd1256
Author: Jiri Denemark <jdenemar>
Date: Wed Nov 30 15:31:45 2011 +0100
qemu: Rework handling of shutdown event
When QEMU guest finishes its shutdown sequence, qemu stops virtual CPUs
and when started with -no-shutdown waits for us to kill it using
SGITERM. Since QEMU is flushing its internal buffers, some time may pass
before QEMU actually dies. We mistakenly used "paused" state (and
events) for this which is quite confusing since users may see a domain
going to pause while they expect it to shutdown. Since we already have
"shutdown" state with "the domain is being shut down" semantics, we
should use it for this state.
However, the state didn't have a corresponding event so I created one
and called its detail as VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED (guest OS
finished its shutdown sequence) with the intent to add
VIR_DOMAIN_EVENT_SHUTDOWN_STARTED in the future if we have a
sufficiently capable guest agent that can notify us when guest OS starts
to shutdown.
Note, that vdsm/rhev will need to be adapted to this change. The "shutdown" state is in libvirt for ages but it hasn't been used by qemu driver until now so there's a chance vdsm doesn't handle it. And even if it can cope with this state, it will certainly need to be modified to handle the new shutdown event. Since this bug has visible effects on end rhev-3.0 users, I would like to request backporting this to 6.2.z. However, (In reply to comment #10) > Note, that vdsm/rhev will need to be adapted to this change. The "shutdown" > state is in libvirt for ages but it hasn't been used by qemu driver until now > so there's a chance vdsm doesn't handle it. And even if it can cope with this > state, it will certainly need to be modified to handle the new shutdown event. changing vdsm to cope with this change in an async z-stream may require some thinking. An unexpected event is of no problem to Vdsm. The only problem I see is if Vdsm is restarted and finds a Vm in shutdown state - since it is not VIR_DOMAIN_RUNNING, we would still report the Vm as Paused. Test with libvirt-0.9.9-0rc1.el6.x86_64, Bug Verified. Shut down a running guest in RHEVM, the guest's status will transition as follows: Up -> Powering Down -> Down.
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
When shutting down, a virtual machine had changed its status from the "Up" state to the "Paused" state before it was shutdown. The "Paused" state represented the state when the guest had been already stopped, but QEMU was flushing its internal buffers and was waiting for libvirt to kill it. But this state change confused users so this update adds respective events and modifies libvirt to use the "shutdown" state. A virtual machine now moves from the "Up" to "Powering Down" and then to "Down" state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-0748.html |
Description of problem: LIBVIRT [libvirt-0.9.4-21.el6.x86_64] - Shutting Down VM changes its state to "Pause" for 10sec Version-Release number of selected component (if applicable): RHEVM Version ic148 Host: RHEL6.2 with updates [root@master-vds13 log]# rpm -qa | grep vdsm vdsm-4.9-110.el6.x86_64 vdsm-cli-4.9-110.el6.x86_64 [root@master-vds13 log]# rpm -qa | grep libvirt libvirt-python-0.9.4-21.el6.x86_64 libvirt-0.9.4-21.el6.x86_64 libvirt-client-0.9.4-21.el6.x86_64 [root@master-vds13 log]# rpm -qa | grep qemu gpxe-roms-qemu-0.9.7-6.9.el6.noarch qemu-kvm-debuginfo-0.12.1.2-2.207.el6.x86_64 qemu-kvm-0.12.1.2-2.207.el6.x86_64 qemu-kvm-tools-0.12.1.2-2.207.el6.x86_64 qemu-img-0.12.1.2-2.207.el6.x86_64 How reproducible: Always Steps to Reproduce: 1. Open RHEVM Admin Portal and attempt to shut down gracefully the VM. 2. The VM change status from "Up" to "preparing to down" 3. The VM status moved to "Paused" for 10 secs 4. The VM shutdown. Actual results: The VM shouldn't move to Pause state, is there a change you did??? Thread-45490::DEBUG::2011-11-14 10:33:42,441::task::495::TaskManager.Task::(_debug) Task 60c078f2-0117-4d34-bd2b-020066d0097f: moving from state preparing -> state finished Thread-45490::DEBUG::2011-11-14 10:33:42,442::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-45490::DEBUG::2011-11-14 10:33:42,442::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-45490::DEBUG::2011-11-14 10:33:42,442::task::495::TaskManager.Task::(_debug) Task 60c078f2-0117-4d34-bd2b-020066d0097f: ref 0 aborting False Thread-45490::INFO::2011-11-14 10:33:42,443::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: repoStats, Return response: {'status': {'message': 'OK', 'code': 0}, u'257425d7-ad6c-4569-89f6-45b52 e6d0a4f': {'delay': '0.0251710414886', 'lastCheck': 1321259613.923393, 'valid': True, 'code': 0}, u'8c6c9c77-4525-454b-91c0-f02bd06511c5': {'delay': '0.0909280776978', 'lastCheck': 1321259621.8530221, 'valid': Tru e, 'code': 0}, u'36141546-c0ce-4d05-ab43-67e4b0c93e1f': {'delay': '0.0122480392456', 'lastCheck': 1321259613.923548, 'valid': True, 'code': 0}, u'ab800c64-3caa-4c28-b61f-c3073d487c50': {'delay': '0.00139307975769' , 'lastCheck': 1321259621.2020209, 'valid': True, 'code': 0}} libvirtEventLoop::DEBUG::2011-11-14 10:33:42,696::libvirtvm::1369::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`a3e50c9e-9518-4198-9ec9-dbe0a5ff14bd`::event Suspended detail 0 opaque None Thread-45008::ERROR::2011-11-14 10:33:43,284::guestIF::246::vm.Vm::(run) vmId=`a3e50c9e-9518-4198-9ec9-dbe0a5ff14bd`::Run exception: None Traceback (most recent call last): File "/usr/share/vdsm/guestIF.py", line 238, in run line = self._readLine() File "/usr/share/vdsm/guestIF.py", line 214, in _readLine self._readBuffer() File "/usr/share/vdsm/guestIF.py", line 200, in _readBuffer s = self._sock.recv(self.READSIZE) Expected results: The VM shouldn't move to "Pause" state during Shut Down process. Additional info: Attached relevant logs: libvirt, qemu, and vdsm