This happens for me in RHEL 6.2 beta with libvirt-0.9.4-13.el6.x86_64. After doing virsh shutdown on a few guests, they show up as paused:

 44 builder_rhel6        paused
 45 builder_rhel5        paused
 46 builder_debian6      paused
 47 builder_ubuntu1104   paused

Original bug description follows ...

+++ This bug was initially created as a clone of Bug #739895 +++

Description of problem:
With the newest libvirt, "virsh shutdown fooguest" results in a 'paused' state.

Version-Release number of selected component (if applicable):
For F15, I made a scratch build using the libvirt-0.9.5-1.fc14.src.rpm from ftp://libvirt.org/libvirt/ (http://koji.fedoraproject.org/koji/taskinfo?taskID=3363547)

######################
[root@moon ~]# cat /etc/redhat-release
Fedora release 15 (Lovelock)
######################
[root@moon ~]# rpm -q libvirt libvirt-client
libvirt-0.9.5-1.fc15.x86_64
libvirt-client-0.9.5-1.fc15.x86_64
[root@moon ~]#
######################

How reproducible:
All the time

Steps to Reproduce:
1. create a new fedora-15 minimal guest (only @core)
2. stop the guest using virsh

#####################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------
  5 f15foo10             running

[root@moon ~]#
#####################
[root@moon ~]# virsh shutdown f15foo10
#####################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------
  5 f15foo10             paused

[root@moon ~]#
#####################

Actual results:
Guest shutdown results in a 'paused' state instead of shut off.

Expected results:
Guest shutdown should happen successfully.

Additional info:
Attached is the output of 'virsh shutdown f15foo10' with LIBVIRT_DEBUG=1

--- Additional comment from kchamart on 2011-09-20 07:03:57 EDT ---

Created attachment 524000 [details]
virsh shutdown fooguest with LIBVIRT_DEBUG=1

--- Additional comment from jdenemar on 2011-09-20 08:15:28 EDT ---

This is most likely a qemu bug. What does "virsh domstate --reason f15foo10" say when you see the guest is paused?
What is the version of the installed qemu package?

--- Additional comment from kchamart on 2011-09-20 08:25:32 EDT ---

Jiri,

Domain state and qemu-kvm version here:

###################
[root@moon ~]# virsh domstate --reason f15foo10
paused (shutting down)
###################
[root@moon ~]# rpm -q qemu-kvm
qemu-kvm-0.15.0-4.fc15.x86_64
[root@moon ~]#
###################

--- Additional comment from jdenemar on 2011-09-20 08:52:52 EDT ---

Thanks, the qemu-kvm package is most likely missing this patch:
http://lists.nongnu.org/archive/html/qemu-devel/2011-09/msg01757.html

--- Additional comment from berrange on 2011-09-20 09:01:41 EDT ---

Hmm, this is not very good, because it means virsh shutdown is now broken for every released version of QEMU in existence, surely? I don't think we can go with such a widespread regression; even though the bug is technically in QEMU, we ought to avoid it in libvirt.

--- Additional comment from jdenemar on 2011-09-20 09:28:15 EDT ---

Not all QEMU versions. The bug was introduced sometime around 0.14.0-rc; earlier versions should work fine, I believe. We could avoid it by not using -no-shutdown for QEMU 0.14 and 0.15 (I don't see a better way of detecting whether the bug is there or not).
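[Editorial note] The state/reason pair that `virsh domstate --reason` prints can also be read programmatically through the libvirt Python bindings. The following is a minimal sketch, assuming the `libvirt-python` bindings are installed and a guest named `f15foo10` exists; the numeric constants in the helper are a subset of libvirt's documented `virDomainState` and `virDomainPausedReason` enums.

```python
# Minimal sketch: decode the (state, reason) pair returned by
# virDomainGetState -- the same data "virsh domstate --reason" prints.
# The libvirt import is kept inside main() so the pure helper can be
# exercised without the bindings installed.

# Subset of libvirt's virDomainState / virDomainPausedReason enums.
STATE_NAMES = {1: "running", 3: "paused", 4: "shutdown", 5: "shut off"}
PAUSED_REASONS = {1: "user", 5: "I/O error", 8: "shutting down"}

def describe_state(state, reason):
    """Render a (state, reason) pair roughly the way virsh does."""
    name = STATE_NAMES.get(state, "unknown")
    if state == 3:  # VIR_DOMAIN_PAUSED
        return "%s (%s)" % (name, PAUSED_REASONS.get(reason, "unknown"))
    return name

def main(domain_name="f15foo10"):  # guest name is just this bug's example
    import libvirt  # requires the libvirt-python bindings
    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName(domain_name)
    state, reason = dom.state()  # wraps virDomainGetState
    print(describe_state(state, reason))

if __name__ == "__main__":
    main()
```

On a guest hit by this bug, the output would be the same "paused (shutting down)" that kchamart reports below.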
--- Additional comment from veillard on 2011-09-20 09:48:14 EDT ---

Confirmed. On F14 with qemu-kvm-0.13.0-1.fc14.x86_64, virsh shutdown seems to work normally:

[root@paphio ~]# virsh list
 Id Name                 State
----------------------------------
  1 RHEL-5.4-64          running

[root@paphio ~]# virsh shutdown RHEL-5.4-64
Domain RHEL-5.4-64 is being shutdown
[root@paphio ~]# rpm -q qemu-kvm
qemu-kvm-0.13.0-1.fc14.x86_64
[root@paphio ~]# virsh list
 Id Name                 State
----------------------------------

[root@paphio ~]# virsh list --all
 Id Name                 State
----------------------------------
  - RHEL-5.4-64          shut off

Daniel

--- Additional comment from jdenemar on 2011-09-21 16:45:23 EDT ---

This issue should be fixed upstream by:

commit f84aedad090da1e05ccc5651815febba013eb3ad
Author: Jiri Denemark <jdenemar>
Date:   Wed Sep 21 10:25:29 2011 +0200

    qemu: Fix shutdown regression with buggy qemu

    The commit that prevents disk corruption on domain shutdown
    (96fc4784177ecb70357518fa863442455e45ad0e) causes a regression with
    QEMU 0.14.* and 0.15.* because of a regression bug in QEMU that was
    fixed only recently in QEMU git. The affected versions of QEMU do
    not quit on SIGTERM if started with -no-shutdown, which we use to
    implement fake reboot. Since -no-shutdown tells QEMU not to quit
    automatically on guest shutdown, domains started using the affected
    QEMU cannot be shut down properly and stay in a paused state. This
    patch disables the fake reboot feature on such QEMU by not using
    -no-shutdown, which makes shutdown work as expected. However,
    virDomainReboot will not work in this case and it will report
    "Requested operation is not valid: Reboot is not supported with
    this QEMU binary".

--- Additional comment from jdenemar on 2011-09-22 05:10:30 EDT ---

This patch is now included in the libvirt-0.9.6 release.

--- Additional comment from kchamart on 2011-09-22 07:34:47 EDT ---

Thanks.
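[Editorial note] The fix described in the commit message amounts to a version gate: skip -no-shutdown (and thereby give up fake reboot) on the QEMU versions that ignore SIGTERM when started with it. The actual libvirt patch makes this decision from its parsed QEMU version inside the qemu driver; the helper below is only an illustrative sketch of that decision, with hypothetical function names, not the real libvirt code.

```python
# Illustrative sketch of the version gate from the fix: QEMU 0.14.* and
# 0.15.* do not quit on SIGTERM when started with -no-shutdown, so the
# flag is avoided (disabling fake reboot) on those versions.
# Hypothetical helpers, not the actual libvirt implementation.

def parse_version(version):
    """'0.15.0' -> (0, 15, 0)"""
    return tuple(int(p) for p in version.split(".")[:3])

def use_no_shutdown(qemu_version):
    """Return True if -no-shutdown is safe to pass to this QEMU."""
    major, minor, _ = parse_version(qemu_version)
    buggy = (major, minor) in ((0, 14), (0, 15))
    return not buggy

def qemu_extra_args(qemu_version):
    """Arguments conceptually added for fake-reboot support."""
    return ["-no-shutdown"] if use_no_shutdown(qemu_version) else []
```

With this gate, 0.13.0 keeps fake reboot (matching Daniel's working F14 result above) while 0.15.0 loses it, which is why affected binaries report "Reboot is not supported with this QEMU binary".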
VERIFIED

I made a quick scratch build (for F15) using the libvirt F14 SRPM:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3369923

And shutting down the guest is now graceful.

##########################################################################
[root@moon ~]# rpm -q libvirt
libvirt-0.9.6-1.fc15.x86_64
##########################################################################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------
  7 f15foo10             running

##########################################################################
[root@moon ~]# virsh shutdown f15foo10 ; virsh console f15foo10
Domain f15foo10 is being shutdown
Connected to domain f15foo10
Escape character is ^]
Killing mdmonitor:
Stopping PC/SC smart card daemon (pcscd):
Shutting down sm-client:
[  143.322077] systemd[1]: var-lib-nfs-rpc_pipefs.mount mount process exited, code=exited status=1
[  OK  ]
Stopping sshd:
[  143.475185] smartd[621]: smartd received signal 15: Terminated
[  143.475202] smartd[621]: smartd is exiting (exit status 0)
[  143.475213] acpid[644]: exiting
[  143.475233] modem-manager[672]: <info> Caught signal 15, shutting down...
[  143.475245] NetworkManager[600]: <warn> disconnected by the system bus.
[  143.475255] NetworkManager[600]: <info> caught signal 15, shutting down normally.
[  143.475273] NetworkManager[600]: <warn> quit request received, terminating...
[  143.475286] NetworkManager[600]: <info> exiting (success)
[  143.544614] sshd[797]: Received signal 15; terminating.
Stopping RPC idmapd:
[  143.582970] systemd[1]: sshd.service: main process exited, code=exited, status=255
Stopping NFS statd:
Shutting down sendmail:
[  143.599799] rpc.statd[825]: Caught signal 15, un-registering and exiting
[  143.604281] systemd[1]: nfslock.service: main process exited, code=exited, status=1
[  OK  ]
[  OK  ]
[  143.665284] systemd[1]: Unit sshd.service entered failed state.
[  OK  ]
[  143.737185] systemd[1]: Unit nfslock.service entered failed state.
[  144.297288] systemd[1]: pcscd.service: main process exited, code=exited, status=1
[  OK  ]
[  144.461158] systemd[1]: Unit pcscd.service entered failed state.
Stopping rpcbind:
[  144.720246] rpcbind[720]: rpcbind terminating on signal. Restart with "rpcbind -w"
[  144.724478] systemd[1]: rpcbind.service: main process exited, code=exited, status=2
[  OK  ]
[  144.818168] systemd[1]: Unit rpcbind.service entered failed state.
Stopping auditd:
[  144.871352] type=1305 audit(1316690292.412:97): audit_pid=0 old=683 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[  144.873205] auditd[683]: The audit daemon is exiting.
[  OK  ]
[  144.980974] type=1305 audit(1316690292.521:98): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditctl_t:s0 res=1
Not stopping monitoring, this is a dangerous operation. Please use force-stop to override.
[  145.095679] systemd[1]: lvm2-monitor.service: control process exited, code=exited status=1
[  145.112677] systemd[1]: Unit lvm2-monitor.service entered failed state.
iptables: Flushing firewall rules:
ip6tables: Flushing firewall rules:
[  OK  ]
[  OK  ]
iptables: Setting chains to policy ACCEPT: filter
ip6tables: Setting chains to policy ACCEPT: filter
[  OK  ]
iptables: Unloading modules: [  OK  ]
ip6tables: Unloading modules: [  OK  ]
[  OK  ]
[  146.994142] systemd[1]: Shutting down.
Sending SIGTERM to remaining processes...
Sending SIGKILL to remaining processes...
Unmounting file systems.
[  147.337958] EXT4-fs (dm-1): re-mounted. Opts: (null)
Disabling swaps.
Detaching loop devices.
Detaching DM devices.
Not all DM devices detached, 1 left.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and de[  147.361770] md: stopping all md devices.
vices, trying to kill remaining processes.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and devices, giving up.
[  148.364613] ACPI: Preparing to enter system sleep state S5
[  148.366414] Disabling non-boot CPUs ...
[  148.369465] Unregister pv shared memory for cpu 1
[  148.371393] Broke affinity for irq 4
[  148.372091] Broke affinity for irq 43
[  148.428899] Power down.
##########################################################################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------

[root@moon ~]#
##########################################################################
The qemu version is: qemu-kvm-0.12.1.2-2.192.el6.x86_64

Notice that I just updated the host from RHEL 6.1 to 6.2 beta, and I have not rebooted the guests or the host since the update.
The qemu version you are mentioning is the installed one, not the one the old guests are still running, right? If so, what qemu version do your guests use? We thought the bug in qemu only affected 6.2 development packages and not any package we already shipped, i.e., 6.1(.z).
First thing: the problem does NOT occur now that I have rebooted the host and all guests are using the latest qemu.

When the problem did occur, the guests were running another version of qemu. However, I do not know which precise version that was, since those guests could have been running for a long time. Here are the versions of qemu that have been installed on this host (note that at some point this was pulling packages from brew, so some of these are not RHEL released versions):

Apr 13 16:20:41 Updated: 2:qemu-kvm-0.12.1.2-2.158.el6.x86_64
May 13 20:38:58 Updated: 2:qemu-kvm-0.12.1.2-2.160.el6.x86_64
Aug 18 15:08:26 Updated: 2:qemu-kvm-0.12.1.2-2.183.el6.x86_64
Sep 22 10:57:29 Updated: 2:qemu-kvm-0.12.1.2-2.190.el6.x86_64
Oct 19 23:06:12 Updated: 2:qemu-kvm-0.12.1.2-2.192.el6.x86_64

I then checked 'last' inside each guest to find out when each was rebooted, and thus what version of qemu it was likely to be running:

builder_rhel6:      Oct 17 => qemu 190
builder_rhel5:      Oct 17 => qemu 190
builder_debian6:    Oct 17 => qemu 190
builder_ubuntu1104: Sep 23 => qemu 190

So they were probably running the unreleased 2:qemu-kvm-0.12.1.2-2.190.el6.x86_64. Does this qemu contain the bug?
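[Editorial note] The inference in the comment above ("last rebooted Oct 17, so probably running the build installed Sep 22") can be sketched as a small lookup: a guest runs whichever qemu binary was most recently installed before the guest was last booted. The sketch below assumes all dates fall in 2011, matching the yum log excerpts quoted in this bug.

```python
# Sketch of the inference above: a guest runs the qemu package that was
# installed most recently *before* the guest was last booted. The year
# (2011) is an assumption; the yum log lines quoted in the bug omit it.
from datetime import datetime

# (install time, package) pairs from the quoted yum log.
UPDATES = [
    (datetime(2011, 4, 13, 16, 20), "2:qemu-kvm-0.12.1.2-2.158.el6"),
    (datetime(2011, 5, 13, 20, 38), "2:qemu-kvm-0.12.1.2-2.160.el6"),
    (datetime(2011, 8, 18, 15, 8),  "2:qemu-kvm-0.12.1.2-2.183.el6"),
    (datetime(2011, 9, 22, 10, 57), "2:qemu-kvm-0.12.1.2-2.190.el6"),
    (datetime(2011, 10, 19, 23, 6), "2:qemu-kvm-0.12.1.2-2.192.el6"),
]

def likely_running_version(boot_time, updates=UPDATES):
    """Latest package installed at or before the guest's boot time."""
    candidates = [pkg for when, pkg in updates if when <= boot_time]
    return candidates[-1] if candidates else None
```

Guests rebooted on Oct 17 started before the Oct 19 update to 192, so this reproduces the comment's conclusion: they were most likely still running the unreleased 190 build.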
OK, the answer to my question is yes: everything before qemu 192 contained the bug, according to the qemu changelog:

* Tue Sep 20 2011 Michal Novotny <minovotn> - qemu-kvm-0.12.1.2-2.192.el6
[...]
- kvm-Fix-termination-by-signal-with-no-shutdown.patch [bz#738487]

Hence closing this bug now.
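[Editorial note] The changelog check above reduces to comparing the RHEL release field of the package name against 2.192. A sketch of that comparison follows; the function name is hypothetical, and real code should use rpm's own version-comparison logic rather than a regex, which here is only good enough for the builds listed in this bug.

```python
# Sketch: decide whether a RHEL-6 qemu-kvm build predates the fix that
# landed in release 2.192.el6 (per the changelog entry above). Real
# code should use rpm's labelCompare; this simple parse suffices for
# the qemu-kvm-0.12.1.2-2.NNN.el6 builds discussed in this bug.
import re

FIXED_RELEASE = 192  # kvm-Fix-termination-by-signal-with-no-shutdown.patch

def has_no_shutdown_bug(package):
    """True if e.g. '2:qemu-kvm-0.12.1.2-2.190.el6.x86_64' predates the fix."""
    m = re.search(r"-2\.(\d+)\.el6", package)
    if not m:
        raise ValueError("unrecognized qemu-kvm package name: %s" % package)
    return int(m.group(1)) < FIXED_RELEASE
```

Applied to the yum log above, every build up to and including 2.190 is flagged as buggy, which is why the guests booted on Oct 17 hit the paused state.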
*** Bug 747708 has been marked as a duplicate of this bug. ***