Bug 913613 - Some Instances are shutoff after they're suspended externally to nova
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 2.0 (Folsom)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: snapshot4
Target Release: 2.1
Assigned To: Pádraig Brady
QA Contact: Kashyap Chamarthy
Keywords: Triaged
Depends On:
Blocks:
Reported: 2013-02-21 10:51 EST by Pádraig Brady
Modified: 2016-01-04 09:46 EST

See Also:
Fixed In Version: openstack-nova-2012.2.3-2.el6ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 890512
Environment:
Last Closed: 2013-03-21 14:16:41 EDT
Type: Bug

External Trackers
Tracker           ID       Priority  Status  Summary  Last Updated
Launchpad         1097806  None      None    None     Never
OpenStack gerrit  20337    None      None    None     Never

Description Pádraig Brady 2013-02-21 10:51:50 EST
This is similar to bug #890512

To test, suspend the VM externally to nova and ensure that nova doesn't stop the VM after a while.
Comment 3 Kashyap Chamarthy 2013-03-20 02:09:32 EDT
1] Version info:
#-------------#
$ cat /etc/redhat-release ; arch
Red Hat Enterprise Linux Server release 6.4 (Santiago)
x86_64
#-------------#


== Verification info: ==

2] Ensure the fix is in:
#-------------#
$ rpm -q openstack-nova --changelog | grep 913613
- Fix state sync logic related to the PAUSED VM state #913613
#-------------#

2.1] Check the patch referenced in Comment #1
#-------------#
$ grep -i "Instance is paused" /usr/lib/python2.6/site-packages/nova/compute/manager.py -B2
                    # the VM state will go back to running after the external
                    # instrumentation is done. See bug 1097806 for details.
                    LOG.warn(_("Instance is paused unexpectedly. Ignore."),

#-------------#


3] Suspend a running guest externally using "virsh"

======
Notes: 

   - first run $ nova list
   - pick an instance
   - grep -i 639b3bf0-cb97-466c-9f8e-3cf369077e1f /etc/libvirt/qemu/*
     - so that you get the libvirt domain name (instance-NNNNNNNN) used by virsh
======
#-------------#
$ nova list | grep -i f16-t4
| 639b3bf0-cb97-466c-9f8e-3cf369077e1f | f16-t4        | ACTIVE | net1=ww.xx.yy.zz |
#-------------#
[root@interceptor ~(keystone_user1)]# grep -i 639b3bf0-cb97-466c-9f8e-3cf369077e1f /etc/libvirt/qemu/*
/etc/libvirt/qemu/instance-00000044.xml:  <uuid>639b3bf0-cb97-466c-9f8e-3cf369077e1f</uuid>
/etc/libvirt/qemu/instance-00000044.xml:      <entry name='uuid'>639b3bf0-cb97-466c-9f8e-3cf369077e1f</entry>
#-------------#
$ virsh suspend instance-00000044
#-------------#
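
The UUID-to-domain lookup in step 3 can also be scripted instead of grepped. The following is only a sketch: the function name and the trimmed sample XML are made up for illustration, and it assumes the usual libvirt layout in which <name> and <uuid> are direct children of <domain>.

```python
import xml.etree.ElementTree as ET


def domain_name_for_uuid(xml_text, instance_uuid):
    """Return the libvirt domain name if xml_text defines instance_uuid, else None."""
    root = ET.fromstring(xml_text.strip())
    uuid = root.findtext("uuid")
    if uuid is not None and uuid.strip().lower() == instance_uuid.lower():
        return root.findtext("name")
    return None


# Hypothetical, heavily trimmed stand-in for /etc/libvirt/qemu/instance-00000044.xml
SAMPLE_XML = """
<domain type='kvm'>
  <name>instance-00000044</name>
  <uuid>639b3bf0-cb97-466c-9f8e-3cf369077e1f</uuid>
</domain>
"""

if __name__ == "__main__":
    print(domain_name_for_uuid(SAMPLE_XML, "639b3bf0-cb97-466c-9f8e-3cf369077e1f"))
```

In practice one would read each file under /etc/libvirt/qemu/ and pass its contents to the function; the returned name is what gets handed to "virsh suspend".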



4] Observe nova compute log -- /var/log/nova/compute.log
#-------------#
.
.
2013-03-20 11:27:39 2751 WARNING nova.compute.manager [-] Found 7 in the database and 8 on the hypervisor.
2013-03-20 11:27:40 2751 WARNING nova.compute.manager [-] [instance: 639b3bf0-cb97-466c-9f8e-3cf369077e1f] Instance is paused unexpectedly. Ignore.
2013-03-20 11:27:42 2751 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 54168
2013-03-20 11:27:42 2751 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 36
2013-03-20 11:27:42 2751 AUDIT nova.compute.resource_tracker [-] Free VCPUS: 41
2013-03-20 11:27:42 2751 INFO nova.compute.resource_tracker [-] Compute_service record updated for interceptor.lab.eng.pnq.redhat.com 
2013-03-20 11:27:45 2751 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 54168
2013-03-20 11:27:45 2751 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 36
.
.
#-------------#

You can see the message from the patch --- "Instance is paused unexpectedly. Ignore."
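
To avoid eyeballing compute.log, the check in step 4 could be automated roughly as below. This is a sketch, not part of the fix: the regular expression is inferred from the log lines quoted above and may need adjusting for other log formats, and the function name is invented for illustration.

```python
import re

# Layout inferred from the excerpt: "<date> <time> <pid> <LEVEL> <logger> [<context>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+ "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) \[[^\]]*\] (?P<msg>.*)$"
)
PAUSED_MSG = "Instance is paused unexpectedly. Ignore."


def paused_warnings(lines, instance_uuid):
    """Return timestamps of the 'paused unexpectedly' warnings for one instance."""
    tag = "[instance: %s]" % instance_uuid
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip())
        if not match:
            continue  # separator lines, tracebacks, and so on
        msg = match.group("msg")
        if match.group("level") == "WARNING" and tag in msg and PAUSED_MSG in msg:
            hits.append(match.group("ts"))
    return hits


# Two lines copied from the log excerpt in this comment
SAMPLE = [
    "2013-03-20 11:27:39 2751 WARNING nova.compute.manager [-] "
    "Found 7 in the database and 8 on the hypervisor.",
    "2013-03-20 11:27:40 2751 WARNING nova.compute.manager [-] "
    "[instance: 639b3bf0-cb97-466c-9f8e-3cf369077e1f] "
    "Instance is paused unexpectedly. Ignore.",
]

if __name__ == "__main__":
    print(paused_warnings(SAMPLE, "639b3bf0-cb97-466c-9f8e-3cf369077e1f"))
```

Against the real file, pass open("/var/log/nova/compute.log") as the lines argument.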
Comment 4 Kashyap Chamarthy 2013-03-20 02:13:54 EDT
Now list the instance with nova again; it is still reported as 'ACTIVE':
#-------------#
$ nova list | grep f16-t4
| 639b3bf0-cb97-466c-9f8e-3cf369077e1f | f16-t4        | ACTIVE | net1=ww.xx.yy.zz |
#-------------#

However:
#-------------#
$ ssh -i oskey3.priv root@ww.xx.yy.zz
ssh: connect to host ww.xx.yy.zz port 22: No route to host
#-------------#
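
If the status check above needs to be repeated in a script, the row printed by "nova list" can be split into fields. A minimal sketch, assuming the four columns seen in this report (ID, Name, Status, Networks); the function name is hypothetical and other client versions may print different columns.

```python
def parse_nova_list_row(row):
    """Split one '| id | name | status | networks |' row into a dict, or return None."""
    if "|" not in row:
        return None  # border lines such as +------+------+
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 4:
        return None  # assumption: exactly four columns, as in this report
    return dict(zip(("id", "name", "status", "networks"), cells))


# Row copied from the nova list output above
ROW = "| 639b3bf0-cb97-466c-9f8e-3cf369077e1f | f16-t4        | ACTIVE | net1=ww.xx.yy.zz |"

if __name__ == "__main__":
    print(parse_nova_list_row(ROW)["status"])
```

Note that the header row of the table parses as well, so callers should filter on the instance ID or name.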


So, is nova reporting the state consistently? The 'ACTIVE' state above looks dubious, given that the guest is suspended and unreachable.



From the referenced patch, a fragment of the relevant 'elif' control flow statement:
#----------------#
.
.
.
                elif vm_power_state == power_state.PAUSED:
                    # Note(maoy): a VM may get into the paused state not only
                    # because the user request via API calls, but also
                    # due to (temporary) external instrumentations.
                    # Before the virt layer can reliably report the reason,
                    # we simply ignore the state discrepancy. In many cases,
                    # the VM state will go back to running after the external
                    # instrumentation is done. See bug 1097806 for details.
                    LOG.warn(_("Instance is paused unexpectedly. Ignore."),
#----------------#
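
For readers without the nova source at hand, the effect of that branch can be modelled in isolation. The sketch below is NOT nova's code: the constants, the function name, and the "stop" fallback for other mismatches are assumptions made purely for illustration. Only the PAUSED case (log a warning, leave the instance alone) is taken from the fragment above.

```python
import logging

LOG = logging.getLogger("sync_sketch")

# Hypothetical stand-ins for nova's power_state constants; the values are illustrative.
RUNNING, PAUSED, SHUTDOWN = "running", "paused", "shutdown"


def action_for_active_instance(vm_power_state):
    """Decide what a periodic sync does for an instance the database lists as active."""
    if vm_power_state == RUNNING:
        return "none"  # database and hypervisor agree
    if vm_power_state == PAUSED:
        # Behaviour described by the patch: warn and ignore the discrepancy.
        LOG.warning("Instance is paused unexpectedly. Ignore.")
        return "ignore"
    # Assumption for this sketch only: any other mismatch is reconciled by a stop.
    return "stop"


if __name__ == "__main__":
    for state in (RUNNING, PAUSED, SHUTDOWN):
        print(state, "->", action_for_active_instance(state))
```

This also shows why "nova list" keeps reporting ACTIVE: in the PAUSED case nothing is written back to the database, only a warning is logged.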
Comment 5 Pádraig Brady 2013-03-20 07:57:22 EDT
This is OK, as from Nova's point of view it's active.
Are there warnings in the logs that the instance is paused?
If so it's good to go as per upstream arguments at least.
Comment 6 Kashyap Chamarthy 2013-03-20 08:17:10 EDT
Yes, there is a warning as noted in Comment #4 :

 "Instance is paused unexpectedly. Ignore."

which is in the referenced commit.

Relevant log fragment:
#-------------#
.

2013-03-20 11:27:40 2751 WARNING nova.compute.manager [-] [instance: 639b3bf0-cb97-466c-9f8e-3cf369077e1f] Instance is paused unexpectedly. Ignore.
.
.
#-------------#


Conclusion: Turning the bug to VERIFIED per the above comment, as the fix is effective and is demonstrated in the nova compute log file.
Comment 8 errata-xmlrpc 2013-03-21 14:16:41 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0657.html
