| Summary: | nova-network gets stuck on a stale lock file, failure scenario is non-obvious |
|---|---|
| Product: | [Fedora] Fedora |
| Component: | openstack-nova |
| Version: | 17 |
| Status: | CLOSED CURRENTRELEASE |
| Severity: | unspecified |
| Priority: | unspecified |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Reporter: | Cole Robinson <crobinso> |
| Assignee: | Mark McLoughlin <markmc> |
| QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| CC: | akscram, alexander.sakhnov, asalkeld, bfilippov, breu, Jan.van.Eldik, jonathansteffan, markmc, matt_domsch, mlvov, p, rbryant, rkukura |
| Doc Type: | Bug Fix |
| Type: | Bug |
| Last Closed: | 2012-09-27 09:19:49 UTC |

So I think the root cause is that I had a stale nova-network iptables lock file, as someone else mentioned recently on the public list: http://www.mail-archive.com/openstack@lists.launchpad.net/msg09986.html

Understandably, this situation shouldn't happen, so I guess we can use this bug to track that.

*** Bug 812661 has been marked as a duplicate of this bug. ***

Looks similar: https://bugs.launchpad.net/nova/+bug/953924

As does part of: https://bugs.launchpad.net/nova/+bug/1008906

Note that Folsom (in Fedora 18) has revamped the lock file mechanism to be more robust. There have also been two improvements since then to the Essex stale lock cleanup code:

https://github.com/openstack/nova/commit/f2bc403
https://github.com/openstack/nova/commit/1076699

So I'll mark this done.

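For anyone trying to confirm whether they are hitting the stale lock file described above, here is a minimal Python sketch that lists lock files older than a chosen age. It is not nova's own cleanup code: the function name `find_stale_locks`, the `.lock` suffix filter, and the age threshold are all assumptions made for illustration, and the lock directory has to be whatever your nova configuration actually uses. File age is only a heuristic, since a lock that is legitimately held for a long time looks the same.

```python
import os
import time


def find_stale_locks(lock_dir, max_age_seconds, now=None):
    """Return paths of '*.lock' files in lock_dir older than max_age_seconds.

    Hypothetical helper: the suffix filter and the age-based definition of
    "stale" are assumptions, not how nova itself decides a lock is stale.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        # Only consider regular files that look like lock files.
        if not name.endswith(".lock") or not os.path.isfile(path):
            continue
        # Age is measured from the file's last modification time.
        if now - os.path.getmtime(path) > max_age_seconds:
            stale.append(path)
    return stale
```

Calling `find_stale_locks(lock_dir, 300)` with your own lock directory would report lock files untouched for more than five minutes. Check that nova-network is not actually running and holding the lock before removing anything it lists.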
Created attachment 577590: compute log snippet

Okay, I'm not sure if this is just me doing something wrong, but following the f17 'getting started' docs doesn't let me launch an instance. This is a reasonably fresh f17 host.

Everything seems to go fine up until I try to launch the instance like this:

    nova boot myserver2 --flavor 2 --key_name mykey --image $(glance index | grep f16-jeos | awk '{print $1}')

It seems to kick off fine, sits in BUILD status for a minute, then falls into ERROR status. The compute.log indicates that network setup timed out. Annoyingly, though, there isn't a peep in the network.log. And extra annoyingly, 'nova delete' claims to work but the dead VMs don't go anywhere :(

Is 'nova-manage network create' supposed to create the bridge device immediately? It isn't doing so in my case, though it may simply queue the changes until they are actually needed.
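On the bridge question, one way to see whether a bridge device exists on a Linux host at a given moment is to look for the `bridge` subdirectory that the kernel exposes under `/sys/class/net/<interface>`. A small sketch follows. The helper name `bridge_exists` and the `sysfs_root` parameter are hypothetical, and the bridge name to check is whatever your network configuration specifies. This only reports the current state of the host; it says nothing about when nova is supposed to create the bridge.

```python
import os


def bridge_exists(name, sysfs_root="/sys/class/net"):
    """Return True if interface `name` exists and is a Linux bridge.

    Hypothetical helper: relies on the kernel exposing a 'bridge'
    directory under /sys/class/net/<name> for bridge devices.
    """
    return os.path.isdir(os.path.join(sysfs_root, name, "bridge"))
```

Running this before and after the first 'nova boot' attempt would show whether the bridge only appears once an instance actually needs it.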