Bug 858276

Summary: start of default libvirt network and bridge device, virbr0, causes failure of nova network
Product: Red Hat OpenStack
Reporter: Dan Yocum <dyocum>
Component: doc-Getting_Started_Guide
Assignee: RHOS Documentation Team <rhos-docs>
Status: CLOSED DUPLICATE
QA Contact: ecs-bugs
Severity: high
Priority: unspecified
Version: 1.0 (Essex)
CC: breeler, ndipanov, sgordon
Keywords: Documentation
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2013-01-16 16:47:34 UTC
Type: Bug

Description Dan Yocum 2012-09-18 14:15:04 UTC
Description of problem:

Starting the default libvirt network and its bridge device, virbr0, causes nova networking to fail intermittently and unpredictably on RHEL 6.3.

Version-Release number of selected component (if applicable):

Essex

How reproducible:

Always, but the symptoms vary from run to run.

Steps to Reproduce:
1. Start the libvirt default network, either before or after openstack-nova-network.
2. Start a number of VMs on all compute nodes with floating IP auto-assignment enabled.
3. Attempt to ping or ssh into all VMs. Some attempts will fail and some will succeed.
4. Check the routing tables on all compute nodes. If 192.168.122.1 is listed, run 'ifconfig virbr0 down' and wait a while.
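
The routing-table check in step 4 can be scripted. The following is only a minimal sketch, not part of the original report: it assumes the libvirt default network uses the stock 192.168.122.0/24 subnet (the report mentions only 192.168.122.1), and the function names find_libvirt_routes and read_routes are hypothetical helpers introduced for illustration.

```python
import ipaddress
import subprocess

# Assumption: libvirt's default network is 192.168.122.0/24 on virbr0.
# The report itself only names the address 192.168.122.1.
LIBVIRT_DEFAULT_NET = ipaddress.ip_network("192.168.122.0/24")
LIBVIRT_BRIDGE = "virbr0"


def find_libvirt_routes(route_output):
    """Return routing-table lines that reference virbr0 or its subnet."""
    hits = []
    for line in route_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if LIBVIRT_BRIDGE in fields:
            hits.append(line.strip())
            continue
        for field in fields:
            # Drop an optional prefix length such as "/24" before parsing.
            candidate = field.split("/")[0]
            try:
                addr = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # not an IP address (e.g. "dev", "default")
            if addr in LIBVIRT_DEFAULT_NET:
                hits.append(line.strip())
                break
    return hits


def read_routes():
    """Read the kernel routing table via `ip route show` (iproute2)."""
    return subprocess.check_output(["ip", "route", "show"], text=True)
```

On a compute node, find_libvirt_routes(read_routes()) returns the offending routes, if any. The parser accepts the output of either `ip route show` or `route -n`, since it only looks for the bridge name or an address inside the assumed subnet.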
  
Actual results:

ping and ssh to VMs intermittently succeed and fail, on random compute nodes

Expected results:

ping and ssh should always succeed to all VMs

Additional info:

A similar bug exists in Fedora and libvirt; it needs to be addressed in RHEL, too:

https://bugzilla.redhat.com/show_bug.cgi?id=802475

and

http://libvirt.org/git/?p=libvirt.git;a=commit;h=a83fe2c23efad190a1e00e448f607fe032650fd6

Comment 2 Nikola Dipanov 2013-01-04 13:27:35 UTC
It seems that running both nova-networking/quantum and libvirt networking on the hypervisor node is causing issues. 

I am not sure we can do anything about this other than warn users that this will cause issues, so I will move this bug to docs.

Comment 3 Dan Yocum 2013-01-04 14:42:56 UTC
Another contributing factor to this bug may be ARP issues in a multi-host HA FlatDHCP environment like ours. The solution appears to be to set send_arp_for_ha=true in nova.conf for *ALL* HA networking environments, i.e., those where nova-network runs on every compute node.
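
As a rough illustration of the setting described above, the sketch below checks whether send_arp_for_ha is enabled in a nova.conf. It assumes the file uses the INI layout with a [DEFAULT] section; a flag-style file (lines beginning with "--") would not parse this way. The sample text and the function name send_arp_for_ha_enabled are hypothetical.

```python
import configparser

# Hypothetical nova.conf excerpt in INI form. Only the send_arp_for_ha
# option comes from the comment above; the layout is an assumption.
SAMPLE_NOVA_CONF = """\
[DEFAULT]
send_arp_for_ha = true
"""


def send_arp_for_ha_enabled(conf_text):
    """Return True if send_arp_for_ha is set to a true value in [DEFAULT]."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # A missing option is treated as disabled.
    return parser.getboolean("DEFAULT", "send_arp_for_ha", fallback=False)
```

To check a real host, read the file contents (commonly /etc/nova/nova.conf, though the path depends on the installation) and pass them to send_arp_for_ha_enabled on each compute node.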

See this bug report for more details:

 https://bugs.launchpad.net/openstack-manuals/+bug/1093000

Comment 4 Stephen Gordon 2013-01-16 16:47:34 UTC

*** This bug has been marked as a duplicate of bug 888812 ***