Bug 812677

Summary: nova-network gets stuck on a stale lock file, failure scenario is non-obvious
Product: [Fedora] Fedora Reporter: Cole Robinson <crobinso>
Component: openstack-nova Assignee: Mark McLoughlin <markmc>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: akscram, alexander.sakhnov, asalkeld, bfilippov, breu, Jan.van.Eldik, jonathansteffan, markmc, matt_domsch, mlvov, p, rbryant, rkukura
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2012-09-27 09:19:49 UTC Type: Bug
Attachments:
compute log snippet (flags: none)

Description Cole Robinson 2012-04-15 22:01:25 UTC
Created attachment 577590: compute log snippet

Okay, I'm not sure if this is just me doing something wrong, but the f17 'getting started' docs aren't letting me launch an instance. This is a reasonably fresh f17 host. Everything seems to go fine up until I try to launch the instance with:

nova boot myserver2 --flavor 2 --key_name mykey --image $(glance index | grep f16-jeos | awk '{print $1}')

It seems to kick off fine, sits in BUILD status for a minute, then falls into ERROR status.

The compute.log indicates that network setup timed out. Annoyingly, though, there isn't a peep in the network.log. And, extra annoyingly, 'nova delete' claims to work but the dead VMs don't go anywhere :(

Is 'nova-manage network create' supposed to immediately create the bridge device? It isn't doing so in my case, but it could just queue changes until actually needed.
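
Either way, one way to check whether the bridge is present at a given moment is to look for the device's "bridge" directory in sysfs. The following is a minimal Python sketch under stated assumptions: the bridge name br100 is an assumption (use whatever name was given to 'nova-manage network create'), and bridge_exists is a hypothetical helper, not part of nova.

```python
import os


def bridge_exists(name, sysfs_root="/sys/class/net"):
    """Return True if a Linux bridge device called `name` is present.

    On Linux, a bridge device exposes a "bridge" subdirectory under
    /sys/class/net/<name>; ordinary interfaces do not have it.
    """
    return os.path.isdir(os.path.join(sysfs_root, name, "bridge"))


if __name__ == "__main__":
    # "br100" is an assumed bridge name, used here only for illustration.
    print("br100 present:", bridge_exists("br100"))
```

The sysfs_root parameter exists only so the helper can be pointed at a test directory; on a real host the default should be what you want.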

Comment 1 Cole Robinson 2012-04-16 15:38:55 UTC
So the root cause I think is that I had a stale nova-network iptables lock file, as someone else mentioned recently on the public list:

http://www.mail-archive.com/openstack@lists.launchpad.net/msg09986.html

Obviously this situation shouldn't happen, so I guess we can use this bug to track that.
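
For anyone hitting the same symptom, a rough way to spot candidate stale lock files is to list lock files that have not been touched for a while. This is only a sketch under stated assumptions: the directory /var/lib/nova/tmp and the ".lock" suffix are assumptions (check lock_path in nova.conf for the real location), find_stale_locks is a hypothetical helper, and an old modification time only suggests that a lock is stale; it does not prove it.

```python
import os
import time


def find_stale_locks(lock_dir, max_age_seconds, now=None):
    """Return paths of *.lock files in lock_dir older than max_age_seconds.

    Age is measured from the file's modification time, so this is a
    heuristic: a long-running but healthy holder would also be reported.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        # Only consider regular files that look like lock files.
        if not (name.endswith(".lock") and os.path.isfile(path)):
            continue
        if now - os.path.getmtime(path) > max_age_seconds:
            stale.append(path)
    return stale


if __name__ == "__main__":
    # Assumed lock directory; the real one depends on the nova configuration.
    lock_dir = "/var/lib/nova/tmp"
    if os.path.isdir(lock_dir):
        for path in find_stale_locks(lock_dir, max_age_seconds=600):
            print("possibly stale:", path)
```

The script only reports candidates. Whether it is safe to remove one depends on confirming that no running nova process still holds it.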

Comment 2 Cole Robinson 2012-04-18 14:26:43 UTC
*** Bug 812661 has been marked as a duplicate of this bug. ***

Comment 3 Mark McLoughlin 2012-06-07 12:31:34 UTC
Looks similar: https://bugs.launchpad.net/nova/+bug/953924

Comment 4 Pádraig Brady 2012-06-07 13:13:36 UTC
As does part of: https://bugs.launchpad.net/nova/+bug/1008906

Comment 5 Pádraig Brady 2012-09-27 09:19:49 UTC
Note that Folsom (in Fedora 18) has revamped the lock file mechanism to be more robust. There have also been two improvements since then to the Essex stale lock cleanup code:

https://github.com/openstack/nova/commit/f2bc403
https://github.com/openstack/nova/commit/1076699

So I'll mark this done.
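
As background on why stale locks occur: when the mere existence of a file is treated as the lock, a holder that dies without cleaning up leaves every later process blocked. A common way to avoid this is a kernel-managed advisory lock, which the kernel drops when the holding process exits. The sketch below illustrates that general technique with fcntl.flock. It is not necessarily what Folsom or the commits above implement, and try_lock is a hypothetical helper.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive advisory lock on `path` without blocking.

    Returns an open file descriptor that holds the lock, or None if another
    holder currently has it. The lock is released when the descriptor is
    closed, including when the holding process exits or crashes, so a
    leftover file on disk never blocks later callers by itself.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock right now.
        os.close(fd)
        return None
    return fd


if __name__ == "__main__":
    import tempfile

    # Illustrative lock path in a scratch directory, not a real nova path.
    lock_path = os.path.join(tempfile.mkdtemp(), "example.lock")
    fd = try_lock(lock_path)
    print("acquired" if fd is not None else "busy")
    if fd is not None:
        os.close(fd)  # closing the descriptor releases the lock
```

With this approach the lock file can stay on disk indefinitely. Only a live descriptor holding the flock counts as ownership.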