Description of problem:
Starting the default libvirt network and bridge device, virbr0, causes nova networking to fail in strange ways on RHEL 6.3.

Version-Release number of selected component (if applicable):
Essex

How reproducible:
Always, but in random ways...

Steps to Reproduce:
1. Start libvirt default network either before or after openstack-nova-network
2. Start a number of VMs on all compute nodes with auto-assign of floating IPs
3. Attempt to ping or ssh into all VMs. Some will fail, some won't.
4. Check routing tables on all compute nodes, if 192.168.122.1 is listed, 'ifconfig virbr0 down' and wait a while.

Actual results:
ping and ssh will intermittently succeed and fail to VMs on random compute nodes

Expected results:
ping and ssh should always succeed to all VMs

Additional info:
There is a similar bug in Fedora and libvirt - needs to be addressed in RHEL, too:
https://bugzilla.redhat.com/show_bug.cgi?id=802475
and
http://libvirt.org/git/?p=libvirt.git;a=commit;h=a83fe2c23efad190a1e00e448f607fe032650fd6
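The routing-table check in step 4 could be scripted roughly as follows. This is only a sketch, not part of the original report: the function name libvirt_routes is hypothetical, it assumes the stock libvirt default network 192.168.122.0/24, and it needs Python 3.7+ on whichever machine inspects the output (the RHEL 6.3 system Python is older).

```python
import ipaddress

# Assumption: libvirt's stock default network is 192.168.122.0/24 (bridge virbr0).
LIBVIRT_DEFAULT_NET = ipaddress.ip_network("192.168.122.0/24")


def libvirt_routes(route_output):
    """Return the routing-table lines that reference the libvirt default network.

    route_output is the text printed by a command such as 'ip route show'
    or 'route -n'.
    """
    hits = []
    for line in route_output.splitlines():
        for token in line.split():
            try:
                # strict=False accepts host addresses written with a prefix length.
                net = ipaddress.ip_network(token, strict=False)
            except ValueError:
                continue  # not an address or network (e.g. "dev", "virbr0")
            if net.version == 4 and net.subnet_of(LIBVIRT_DEFAULT_NET):
                hits.append(line)
                break  # one match per line is enough
    return hits
```

Feeding it the captured output from each compute node and getting a non-empty list back would suggest that virbr0 is still up on that node.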
It seems that running both nova-network/quantum and libvirt networking on the same hypervisor node is causing these failures. I am not sure we can do anything about this other than warn users not to combine them, so I will move this bug to docs.
Another contributing factor to this bug may be ARP issues in a multi-host HA FlatDHCP environment like ours. The solution appears to be to set send_arp_for_ha=true in nova.conf for *ALL* HA networking environments, i.e., those where nova-network is running on all compute nodes. See this bug report for more details: https://bugs.launchpad.net/openstack-manuals/+bug/1093000
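A quick way to audit that setting across nodes might look like the sketch below. It is an illustration under assumptions, not something from this report: the helper name send_arp_for_ha_enabled is hypothetical, it assumes nova.conf is in INI format with the option under [DEFAULT] (older flagfile-style files using --send_arp_for_ha=true would not parse this way), and it assumes an unset option means disabled.

```python
import configparser


def send_arp_for_ha_enabled(conf_text):
    """Return True if send_arp_for_ha is enabled in the given nova.conf text.

    Assumptions: INI-style nova.conf, option located in [DEFAULT],
    and a missing option is treated as disabled.
    """
    # interpolation=None keeps '%' characters in other options from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getboolean("DEFAULT", "send_arp_for_ha", fallback=False)
```

Running it against the contents of /etc/nova/nova.conf from every compute node would show which nodes still lack the setting.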
*** This bug has been marked as a duplicate of bug 888812 ***