Bug 891422 - Compute Node should check for VM state inconsistencies on service startup vs. waiting for 10 minutes
Summary: Compute Node should check for VM state inconsistencies on service startup vs. waiting for 10 minutes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 2.0 (Folsom)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: snapshot2
Target Release: 2.1
Assignee: Nikola Dipanov
QA Contact: Yaniv Kaul
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-01-02 21:50 UTC by Perry Myers
Modified: 2019-09-09 17:03 UTC
3 users

Fixed In Version: openstack-nova-2012.2.2-9.el6ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-14 18:24:34 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:0260 0 normal SHIPPED_LIVE Red Hat OpenStack 2.0 (Folsom) Preview bug fix and enhancement update 2013-02-14 23:21:02 UTC

Description Perry Myers 2013-01-02 21:50:55 UTC
Description of problem:
The compute node service has a periodic task that refreshes the state recorded in the database for any VMs whose actual state does not match. By default this task runs every 10 minutes. It would be better to run this task immediately on service start rather than waiting up to 10 minutes for the first refresh.
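The behavior requested above can be sketched as follows. This is a minimal, hypothetical illustration (not actual Nova code; class and attribute names are invented): the same reconciliation the periodic task performs every 10 minutes is simply run once, immediately, when the service starts.

```python
class ComputeManager:
    """Hypothetical sketch of a compute service's power-state sync."""

    SYNC_INTERVAL = 600  # the periodic task's default interval: 10 minutes

    def __init__(self, db_states, hypervisor_states):
        self.db = dict(db_states)                  # instance -> recorded state
        self.hypervisor = dict(hypervisor_states)  # instance -> actual state
        self.synced = []                           # instances corrected so far

    def sync_power_states(self):
        """Reconcile the DB with what the hypervisor actually reports."""
        for inst, db_state in self.db.items():
            actual = self.hypervisor.get(inst, "SHUTOFF")
            if actual != db_state:
                self.db[inst] = actual
                self.synced.append(inst)

    def init_host(self):
        # The fix requested in this bug: sync immediately on service start
        # instead of waiting up to SYNC_INTERVAL for the periodic task.
        self.sync_power_states()
```

After a host power cycle, instances recorded as ACTIVE but absent from the hypervisor would be marked SHUTOFF right away instead of 10 minutes later.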

Version-Release number of selected component (if applicable):
openstack-nova-compute-2012.2.1-2.el6ost.noarch

How reproducible:
Every time

Steps to Reproduce:
1. Start up an instance on a compute node
2. Shut down the compute node (which will kill both the CN service as well as the VM)
3. Start up the compute node
  
Actual results:
The compute node will start up, but the periodic task that syncs the VM states to the database can take up to 10 minutes to run.

Expected results:
State should be refreshed immediately

Additional info:
2013-01-02 16:33:50 16247 INFO nova.openstack.common.rpc.impl_qpid [-] Connected to AMQP server on 192.168.15.2:5672
2013-01-02 16:33:50 INFO nova.compute.manager [req-d954dd04-4831-4ed8-ba2e-7617b0c752e9 None None] Updating host status

...

2013-01-02 16:43:53 16247 WARNING nova.compute.manager [-] Found 3 in the database and 1 on the hypervisor.
2013-01-02 16:43:53 16247 WARNING nova.compute.manager [-] [instance: 21879e7d-9a44-48aa-b507-ea88690860bb] Instance shutdown by itself. Calling the stop API.
2013-01-02 16:43:54 16247 INFO nova.virt.libvirt.driver [-] [instance: 21879e7d-9a44-48aa-b507-ea88690860bb] Instance destroyed successfully.

Comment 1 Russell Bryant 2013-01-15 15:53:31 UTC
There is an option called "resume_guests_state_on_host_boot" that changes the behavior of this use case (see init_host() in nova/compute/manager.py).  When enabled, it automatically restarts the instances that were supposed to be running when nova-compute starts up.  Personally, that is what I would expect to happen by default.  We could consider making this the default in RHOS, I suppose.
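For reference, the option Russell mentions would be enabled in nova.conf. This is a hypothetical snippet; only the option name comes from the comment above:

```ini
[DEFAULT]
# Restart guests that were running on this host before the service stopped.
resume_guests_state_on_host_boot = true
```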

There are cases covered by the sync_power_states periodic task that are not covered by init_host().  It seems like those two methods need a bit of refactoring so that init_host can sync the power state of every instance as it traverses them.

However, it does seem like doing sync_power_states() in init_host() would be good, too.
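The refactoring described above can be sketched roughly as follows. This is an assumption-laden illustration, not Nova's actual implementation: all function names and the flag constant are hypothetical stand-ins for the real config option and manager methods.

```python
# Stand-in for the nova.conf option discussed in comment 1.
RESUME_GUESTS_STATE_ON_HOST_BOOT = False


def init_host(db_instances, hypervisor_is_running, start_instance, sync_state):
    """As init_host traverses the instances recorded for this host, it can
    sync each one's power state (and optionally resume it) itself, instead
    of leaving all reconciliation to the periodic task."""
    for inst, db_state in db_instances.items():
        running = hypervisor_is_running(inst)
        if db_state == "ACTIVE" and not running:
            if RESUME_GUESTS_STATE_ON_HOST_BOOT:
                start_instance(inst)         # bring the guest back up
            else:
                sync_state(inst, "SHUTOFF")  # just record reality in the DB
```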

Comment 2 Russell Bryant 2013-01-15 15:57:00 UTC
Pretend the last sentence in my last comment isn't there ...

Comment 6 Yaniv Kaul 2013-01-29 18:43:15 UTC
Note to QE: test that both service shutdown/restart and a full host power cycle work with that parameter:
1. VMs are started again on that host.
2. If you opt to run those VMs on a different host instead, they do NOT also run on the original host.

Comment 8 Yaniv Kaul 2013-02-10 15:25:33 UTC
Verified on:
[root@cougar10 ~]# rpm -qa |grep nova
python-novaclient-2.10.0-2.el6ost.noarch
python-nova-2012.2.2-9.el6ost.noarch
openstack-nova-network-2012.2.2-9.el6ost.noarch
openstack-nova-common-2012.2.2-9.el6ost.noarch
openstack-nova-compute-2012.2.2-9.el6ost.noarch

Comment 10 errata-xmlrpc 2013-02-14 18:24:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0260.html

