Created attachment 988535
Nova compute log and messages

Description of problem:
On an HA deployment, the nova-compute service on one of my compute nodes failed. Restarting the service doesn't help; it fails again. Nova and messages logs are attached.

Version-Release number of selected component (if applicable):
rhel7
python-nova-2014.1.3-9.el7ost.noarch
openstack-nova-compute-2014.1.3-9.el7ost.noarch
openstack-nova-common-2014.1.3-9.el7ost.noarch
python-novaclient-2.17.0-2.el7ost.noarch
openstack-nova-novncproxy-2014.1.3-9.el7ost.noarch

How reproducible:
Unsure; this is the first time I have seen this.

Steps to Reproduce:
1. The service failed; after restarting it, it failed again.

Actual results:
Nova compute service failed.

Expected results:

Additional info:
Adding compute.log and messages log.
Looking at your error message (one of the most common in Nova):

    2015-02-05 16:52:30.557 8107 TRACE nova.openstack.common.threadgroup NovaException: Unexpected vif_type=binding_failed

On a quick look, this looks like a duplicate of the following bug (which was closed as an environment issue):

https://bugzilla.redhat.com/show_bug.cgi?id=1183253 -- Nova boot failed in _build_and_run_instance --- Unexpected vif_type=binding_failed

Your bug (#1189836) most likely is not a "bug". Why?

- First, the error means Neutron and Nova could not communicate, for any number of reasons (quoting from the bug I linked, 1183253):
    - neutron (-server) was unable to find an OVS agent with the appropriate hostname
    - Do you have a running openvswitch-agent log?
    - Mis-configured agent, or if you've fiddled OVS in a way that was unexpected

So, on the above basis, I'm tempted to close your bug as a duplicate (of 1183253) unless you can consistently reproduce this issue.

Can you try to restart all services in a systematic manner (using `openstack-service`) and see if you can *still* reproduce?

A gentle note: Next time, please add the actual error log messages/tracebacks in the bug instead of only tar.gz files.
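To make it easier to paste the relevant lines into a bug instead of attaching a whole archive, here is a minimal sketch that pulls the binding_failed lines out of a compute log. It assumes the log layout of the trace line quoted in this comment (date, time, PID, level, logger, message); the names `find_binding_failures` and `Hit` are hypothetical helpers, not part of Nova.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed layout, based on the single trace line quoted in this bug:
# "<date> <time> <pid> <LEVEL> <logger> <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<level>\w+) (?P<logger>\S+) (?P<message>.*)$"
)

# Substring that identifies the error discussed in this bug.
MARKER = "vif_type=binding_failed"


class Hit(NamedTuple):
    """One parsed log line that mentions the binding failure."""
    timestamp: str
    pid: int
    level: str
    logger: str
    message: str


def find_binding_failures(lines: Iterable[str]) -> List[Hit]:
    """Return parsed entries for every line containing MARKER.

    Lines that contain the marker but do not match the assumed
    layout are skipped rather than guessed at.
    """
    hits: List[Hit] = []
    for line in lines:
        if MARKER not in line:
            continue
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        hits.append(
            Hit(
                timestamp=match["timestamp"],
                pid=int(match["pid"]),
                level=match["level"],
                logger=match["logger"],
                message=match["message"],
            )
        )
    return hits
```

Passing an open file object for the attached compute.log to `find_binding_failures` would yield one `Hit` per matching line, which can then be printed and pasted into the bug. This only locates the symptom; it says nothing about why the port binding failed.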
With the help of Neutron QE guys Itzikb & Oblaut, we were able to resolve the issue. This is not a duplicate bug: in your case we are talking about a single instance being affected, whereas here we are talking about the whole nova-compute service failing. For details see the LP bug; we were able to reproduce the issue on another deployment: https://bugs.launchpad.net/nova/+bug/1419452
The upstream fix for https://bugs.launchpad.net/nova/+bug/1324041 is already in the 2014.1.4 stable release, so it will be pulled in via the rebase BZ 1199106.
Picked up in 2014.1.4 rebase.
Did the steps as written in https://bugs.launchpad.net/nova/+bug/1419452
===============
1. Launch an instance.
2. Stop the openvswitch agent on the compute node.
3. Attach another interface to the instance using a second network:
   # nova interface-attach --net-id <net-id> <server>
4. Restart nova-compute.

The VM came up with a new interface connected to the other network.
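For anyone re-running this verification, steps 2 to 4 can be sketched as a list of commands. This is a dry-run helper only: it builds the argument lists and does not execute anything. The systemd unit names are assumptions and are not taken from this report, so adjust them to the deployment; `reproduction_plan` is a hypothetical name. Step 1 (launching the instance) is left out, and note that steps 2 and 4 run on the compute node while step 3 runs wherever the nova client has credentials.

```python
from typing import List

# Assumed systemd unit names; they do not appear in this bug report.
OVS_AGENT_UNIT = "neutron-openvswitch-agent"
NOVA_COMPUTE_UNIT = "openstack-nova-compute"


def reproduction_plan(net_id: str, server: str) -> List[List[str]]:
    """Return the commands for steps 2-4 as argument lists.

    Nothing is executed here; the caller decides where and how
    to run each command.
    """
    return [
        # Step 2: stop the openvswitch agent on the compute node.
        ["systemctl", "stop", OVS_AGENT_UNIT],
        # Step 3: attach an interface on a second network
        # (command form taken from the steps in this comment).
        ["nova", "interface-attach", "--net-id", net_id, server],
        # Step 4: restart nova-compute on the compute node.
        ["systemctl", "restart", NOVA_COMPUTE_UNIT],
    ]
```

Printing `" ".join(cmd)` for each entry gives a checklist to run by hand. With the fix in place the expected outcome is the one reported here: nova-compute stays up and the VM gets the new interface.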
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0843.html