This happens for me in RHEL 6.2 beta with libvirt-0.9.4-13.el6.x86_64. After doing virsh shutdown on a few guests, they show up as paused:

 44 builder_rhel6        paused
 45 builder_rhel5        paused
 46 builder_debian6      paused
 47 builder_ubuntu1104   paused

Original bug description follows ...

+++ This bug was initially created as a clone of Bug #739895 +++

Description of problem:
With the newest libvirt, "virsh shutdown fooguest" results in a 'paused' state.

Version-Release number of selected component (if applicable):
For F15, I made a scratch build using the libvirt-0.9.5-1.fc14.src.rpm from ftp://libvirt.org/libvirt/ (http://koji.fedoraproject.org/koji/taskinfo?taskID=3363547)

######################
[root@moon ~]# cat /etc/redhat-release
Fedora release 15 (Lovelock)
######################
[root@moon ~]# rpm -q libvirt libvirt-client
libvirt-0.9.5-1.fc15.x86_64
libvirt-client-0.9.5-1.fc15.x86_64
[root@moon ~]#
######################

How reproducible:
All the time

Steps to Reproduce:
1. create a new fedora-15 minimal guest (only @core)
2. stop the guest using virsh

#####################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------
  5 f15foo10             running

[root@moon ~]#
#####################
[root@moon ~]# virsh shutdown f15foo10
#####################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------
  5 f15foo10             paused

[root@moon ~]#
#####################

Actual results:
Guest shutdown results in a 'paused' state instead of shut off.

Expected results:
Guest shutdown should happen successfully.

Additional info:
Attached is the output of 'virsh shutdown f15foo10' with LIBVIRT_DEBUG=1

--- Additional comment from kchamart on 2011-09-20 07:03:57 EDT ---

Created attachment 524000 [details]
virsh shutdown fooguest with LIBVIRT_DEBUG=1

--- Additional comment from jdenemar on 2011-09-20 08:15:28 EDT ---

This is most likely a qemu bug. What does "virsh domstate --reason f15foo10" say when you see the guest is paused?
What is the version of the installed qemu package?

--- Additional comment from kchamart on 2011-09-20 08:25:32 EDT ---

Jiri,

Domain state and qemu-kvm version here:

###################
[root@moon ~]# virsh domstate --reason f15foo10
paused (shutting down)
###################
[root@moon ~]# rpm -q qemu-kvm
qemu-kvm-0.15.0-4.fc15.x86_64
[root@moon ~]#
###################

--- Additional comment from jdenemar on 2011-09-20 08:52:52 EDT ---

Thanks, the qemu-kvm package is most likely missing this patch:
http://lists.nongnu.org/archive/html/qemu-devel/2011-09/msg01757.html

--- Additional comment from berrange on 2011-09-20 09:01:41 EDT ---

Hmm, this is not very good, because it means virsh shutdown is now broken for every released version of QEMU in existence, surely? I don't think we can go with such a widespread regression; even though the bug is technically in QEMU, we ought to avoid it in libvirt.

--- Additional comment from jdenemar on 2011-09-20 09:28:15 EDT ---

Not all QEMU versions. The bug was introduced sometime around 0.14.0-rc; earlier versions should work fine, I believe. We could avoid it by not using -no-shutdown for QEMU 0.14 and 0.15 (I don't see a better way of detecting whether the bug is there or not).
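[Editorial note] The state/reason pair that `virsh domstate --reason` prints can also be read programmatically through the libvirt Python bindings. The following is a minimal sketch, assuming the `libvirt-python` bindings are installed and a guest named `f15foo10` exists; the numeric constants in the helper are a subset of libvirt's documented `virDomainState` and `virDomainPausedReason` enums.

```python
# Minimal sketch: decode the (state, reason) pair returned by
# virDomainGetState -- the same data "virsh domstate --reason" prints.
# The libvirt import is kept inside main() so the pure helper can be
# exercised without the bindings installed.

# Subset of libvirt's virDomainState / virDomainPausedReason enums.
STATE_NAMES = {1: "running", 3: "paused", 4: "shutdown", 5: "shut off"}
PAUSED_REASONS = {1: "user", 5: "I/O error", 8: "shutting down"}

def describe_state(state, reason):
    """Render a (state, reason) pair roughly the way virsh does."""
    name = STATE_NAMES.get(state, "unknown")
    if state == 3:  # VIR_DOMAIN_PAUSED
        return "%s (%s)" % (name, PAUSED_REASONS.get(reason, "unknown"))
    return name

def main(domain_name="f15foo10"):  # guest name is just this bug's example
    import libvirt  # requires the libvirt-python bindings
    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName(domain_name)
    state, reason = dom.state()  # wraps virDomainGetState
    print(describe_state(state, reason))

if __name__ == "__main__":
    main()
```

On a guest hit by this bug, the output would be the same "paused (shutting down)" that kchamart reports below.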
--- Additional comment from veillard on 2011-09-20 09:48:14 EDT ---

Confirmed. On F14 with qemu-kvm-0.13.0-1.fc14.x86_64, virsh shutdown seems to work normally:

[root@paphio ~]# virsh list
 Id Name                 State
----------------------------------
  1 RHEL-5.4-64          running

[root@paphio ~]# virsh shutdown RHEL-5.4-64
Domain RHEL-5.4-64 is being shutdown
[root@paphio ~]# rpm -q qemu-kvm
qemu-kvm-0.13.0-1.fc14.x86_64
[root@paphio ~]# virsh list
 Id Name                 State
----------------------------------

[root@paphio ~]# virsh list --all
 Id Name                 State
----------------------------------
  - RHEL-5.4-64          shut off

Daniel

--- Additional comment from jdenemar on 2011-09-21 16:45:23 EDT ---

This issue should be fixed upstream by:

commit f84aedad090da1e05ccc5651815febba013eb3ad
Author: Jiri Denemark <jdenemar>
Date:   Wed Sep 21 10:25:29 2011 +0200

    qemu: Fix shutdown regression with buggy qemu

    The commit that prevents disk corruption on domain shutdown
    (96fc4784177ecb70357518fa863442455e45ad0e) causes a regression with
    QEMU 0.14.* and 0.15.* because of a regression bug in QEMU that was
    fixed only recently in QEMU git. The affected versions of QEMU do
    not quit on SIGTERM if started with -no-shutdown, which we use to
    implement fake reboot. Since -no-shutdown tells QEMU not to quit
    automatically on guest shutdown, domains started using the affected
    QEMU cannot be shut down properly and stay in a paused state. This
    patch disables the fake reboot feature on such QEMU by not using
    -no-shutdown, which makes shutdown work as expected. However,
    virDomainReboot will not work in this case and it will report
    "Requested operation is not valid: Reboot is not supported with
    this QEMU binary".

--- Additional comment from jdenemar on 2011-09-22 05:10:30 EDT ---

This patch is now included in the libvirt-0.9.6 release.

--- Additional comment from kchamart on 2011-09-22 07:34:47 EDT ---

Thanks.
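[Editorial note] The fix described in the commit message amounts to a version gate: skip -no-shutdown (and thereby give up fake reboot) on the QEMU versions that ignore SIGTERM when started with it. The actual libvirt patch makes this decision from its parsed QEMU version inside the qemu driver; the helper below is only an illustrative sketch of that decision, with hypothetical function names, not the real libvirt code.

```python
# Illustrative sketch of the version gate from the fix: QEMU 0.14.* and
# 0.15.* do not quit on SIGTERM when started with -no-shutdown, so the
# flag is avoided (disabling fake reboot) on those versions.
# Hypothetical helpers, not the actual libvirt implementation.

def parse_version(version):
    """'0.15.0' -> (0, 15, 0)"""
    return tuple(int(p) for p in version.split(".")[:3])

def use_no_shutdown(qemu_version):
    """Return True if -no-shutdown is safe to pass to this QEMU."""
    major, minor, _ = parse_version(qemu_version)
    buggy = (major, minor) in ((0, 14), (0, 15))
    return not buggy

def qemu_extra_args(qemu_version):
    """Arguments conceptually added for fake-reboot support."""
    return ["-no-shutdown"] if use_no_shutdown(qemu_version) else []
```

With this gate, 0.13.0 keeps fake reboot (matching Daniel's working F14 result above) while 0.15.0 loses it, which is why affected binaries report "Reboot is not supported with this QEMU binary".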
VERIFIED

I made a quick scratch build (for F15) using the libvirt F14 SRPM:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3369923

And shutting down the guest is now graceful.

##########################################################################
[root@moon ~]# rpm -q libvirt
libvirt-0.9.6-1.fc15.x86_64
##########################################################################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------
  7 f15foo10             running

##########################################################################
[root@moon ~]# virsh shutdown f15foo10 ; virsh console f15foo10
Domain f15foo10 is being shutdown
Connected to domain f15foo10
Escape character is ^]
Killing mdmonitor:
Stopping PC/SC smart card daemon (pcscd):
Shutting down sm-client:
[  143.322077] systemd[1]: var-lib-nfs-rpc_pipefs.mount mount process exited, code=exited status=1
[  OK  ]
Stopping sshd:
[  143.475185] smartd[621]: smartd received signal 15: Terminated
[  143.475202] smartd[621]: smartd is exiting (exit status 0)
[  143.475213] acpid[644]: exiting
[  143.475233] modem-manager[672]: <info> Caught signal 15, shutting down...
[  143.475245] NetworkManager[600]: <warn> disconnected by the system bus.
[  143.475255] NetworkManager[600]: <info> caught signal 15, shutting down normally.
[  143.475273] NetworkManager[600]: <warn> quit request received, terminating...
[  143.475286] NetworkManager[600]: <info> exiting (success)
[  143.544614] sshd[797]: Received signal 15; terminating.
Stopping RPC idmapd:
[  143.582970] systemd[1]: sshd.service: main process exited, code=exited, status=255
Stopping NFS statd:
Shutting down sendmail:
[  143.599799] rpc.statd[825]: Caught signal 15, un-registering and exiting
[  143.604281] systemd[1]: nfslock.service: main process exited, code=exited, status=1
[  OK  ]
[  OK  ]
[  143.665284] systemd[1]: Unit sshd.service entered failed state.
[  OK  ]
[  143.737185] systemd[1]: Unit nfslock.service entered failed state.
[  144.297288] systemd[1]: pcscd.service: main process exited, code=exited, status=1
[  OK  ]
[  144.461158] systemd[1]: Unit pcscd.service entered failed state.
Stopping rpcbind:
[  144.720246] rpcbind[720]: rpcbind terminating on signal. Restart with "rpcbind -w"
[  144.724478] systemd[1]: rpcbind.service: main process exited, code=exited, status=2
[  OK  ]
[  144.818168] systemd[1]: Unit rpcbind.service entered failed state.
Stopping auditd:
[  144.871352] type=1305 audit(1316690292.412:97): audit_pid=0 old=683 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[  144.873205] auditd[683]: The audit daemon is exiting.
[  OK  ]
[  144.980974] type=1305 audit(1316690292.521:98): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditctl_t:s0 res=1
Not stopping monitoring, this is a dangerous operation. Please use force-stop to override.
[  145.095679] systemd[1]: lvm2-monitor.service: control process exited, code=exited status=1
[  145.112677] systemd[1]: Unit lvm2-monitor.service entered failed state.
iptables: Flushing firewall rules:
ip6tables: Flushing firewall rules:
[  OK  ]
[  OK  ]
iptables: Setting chains to policy ACCEPT: filter
ip6tables: Setting chains to policy ACCEPT: filter
[  OK  ]
iptables: Unloading modules: [  OK  ]
ip6tables: Unloading modules: [  OK  ]
[  OK  ]
[  146.994142] systemd[1]: Shutting down.
Sending SIGTERM to remaining processes...
Sending SIGKILL to remaining processes...
Unmounting file systems.
[  147.337958] EXT4-fs (dm-1): re-mounted. Opts: (null)
Disabling swaps.
Detaching loop devices.
Detaching DM devices.
Not all DM devices detached, 1 left.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and de[  147.361770] md: stopping all md devices.
vices, trying to kill remaining processes.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and devices, giving up.
[  148.364613] ACPI: Preparing to enter system sleep state S5
[  148.366414] Disabling non-boot CPUs ...
[  148.369465] Unregister pv shared memory for cpu 1
[  148.371393] Broke affinity for irq 4
[  148.372091] Broke affinity for irq 43
[  148.428899] Power down.
##########################################################################
[root@moon ~]# virsh list
 Id Name                 State
----------------------------------

[root@moon ~]#
##########################################################################
The qemu version is: qemu-kvm-0.12.1.2-2.192.el6.x86_64

Notice that I just updated the host from RHEL 6.1 to 6.2 beta, and I have not rebooted the guests or the host since the update.
The qemu version you are mentioning is the installed one, not the one the old guests are still running, right? If so, what qemu version do your guests use? We thought the bug in qemu only affected 6.2 development packages and not any package we already shipped, i.e., 6.1(.z).
First thing: the problem does NOT occur now that I have rebooted the host and all guests are using the latest qemu.

When the problem did occur, the guests were running another version of qemu. However, I do not know which precise version that was, since those guests could have been running for a long time. Here are the versions of qemu that have been installed on this host (note that at some point this was pulling packages from brew, so some of these are not RHEL released versions):

Apr 13 16:20:41 Updated: 2:qemu-kvm-0.12.1.2-2.158.el6.x86_64
May 13 20:38:58 Updated: 2:qemu-kvm-0.12.1.2-2.160.el6.x86_64
Aug 18 15:08:26 Updated: 2:qemu-kvm-0.12.1.2-2.183.el6.x86_64
Sep 22 10:57:29 Updated: 2:qemu-kvm-0.12.1.2-2.190.el6.x86_64
Oct 19 23:06:12 Updated: 2:qemu-kvm-0.12.1.2-2.192.el6.x86_64

I then checked 'last' inside each guest to find out when each was rebooted, and thus what version of qemu it was likely to be running:

builder_rhel6:      Oct 17 => qemu 190
builder_rhel5:      Oct 17 => qemu 190
builder_debian6:    Oct 17 => qemu 190
builder_ubuntu1104: Sep 23 => qemu 190

So they were probably running the unreleased 2:qemu-kvm-0.12.1.2-2.190.el6.x86_64. Does this qemu contain the bug?
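[Editorial note] The inference in the comment above ("last rebooted Oct 17, so probably running the build installed Sep 22") can be sketched as a small lookup: a guest runs whichever qemu binary was most recently installed before the guest was last booted. The sketch below assumes all dates fall in 2011, matching the yum log excerpts quoted in this bug.

```python
# Sketch of the inference above: a guest runs the qemu package that was
# installed most recently *before* the guest was last booted. The year
# (2011) is an assumption; the yum log lines quoted in the bug omit it.
from datetime import datetime

# (install time, package) pairs from the quoted yum log.
UPDATES = [
    (datetime(2011, 4, 13, 16, 20), "2:qemu-kvm-0.12.1.2-2.158.el6"),
    (datetime(2011, 5, 13, 20, 38), "2:qemu-kvm-0.12.1.2-2.160.el6"),
    (datetime(2011, 8, 18, 15, 8),  "2:qemu-kvm-0.12.1.2-2.183.el6"),
    (datetime(2011, 9, 22, 10, 57), "2:qemu-kvm-0.12.1.2-2.190.el6"),
    (datetime(2011, 10, 19, 23, 6), "2:qemu-kvm-0.12.1.2-2.192.el6"),
]

def likely_running_version(boot_time, updates=UPDATES):
    """Latest package installed at or before the guest's boot time."""
    candidates = [pkg for when, pkg in updates if when <= boot_time]
    return candidates[-1] if candidates else None
```

Guests rebooted on Oct 17 started before the Oct 19 update to 192, so this reproduces the comment's conclusion: they were most likely still running the unreleased 190 build.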
OK, the answer to my question is yes: everything before qemu 192 contained the bug, according to the qemu changelog:

* Tue Sep 20 2011 Michal Novotny <minovotn> - qemu-kvm-0.12.1.2-2.192.el6
[...]
- kvm-Fix-termination-by-signal-with-no-shutdown.patch [bz#738487]

Hence closing this bug now.
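[Editorial note] The changelog check above reduces to comparing the RHEL release field of the package name against 2.192. A sketch of that comparison follows; the function name is hypothetical, and real code should use rpm's own version-comparison logic rather than a regex, which here is only good enough for the builds listed in this bug.

```python
# Sketch: decide whether a RHEL-6 qemu-kvm build predates the fix that
# landed in release 2.192.el6 (per the changelog entry above). Real
# code should use rpm's labelCompare; this simple parse suffices for
# the qemu-kvm-0.12.1.2-2.NNN.el6 builds discussed in this bug.
import re

FIXED_RELEASE = 192  # kvm-Fix-termination-by-signal-with-no-shutdown.patch

def has_no_shutdown_bug(package):
    """True if e.g. '2:qemu-kvm-0.12.1.2-2.190.el6.x86_64' predates the fix."""
    m = re.search(r"-2\.(\d+)\.el6", package)
    if not m:
        raise ValueError("unrecognized qemu-kvm package name: %s" % package)
    return int(m.group(1)) < FIXED_RELEASE
```

Applied to the yum log above, every build up to and including 2.190 is flagged as buggy, which is why the guests booted on Oct 17 hit the paused state.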
*** Bug 747708 has been marked as a duplicate of this bug. ***