Description of problem:
Using the LibvirtGenericVIFDriver driver causes multicast broadcasts to be dropped after about 200 seconds.
Workaround used so far (disable multicast snooping on the bridge behind each tap device):
[root@gss-rhos-4 ~]# echo 0 > /sys/devices/virtual/net/tap751c39bc-db/brport/bridge/bridge/multicast_snooping
[root@gss-rhos-4 ~]# echo 0 > /sys/devices/virtual/net/tap18b1e1ef-5a/brport/bridge/bridge/multicast_snooping
Version-Release number of selected component (if applicable):
Steps to Reproduce:
On node1 and node2, which are the multicast receivers, we execute the command: iperf -s -u -B 188.8.131.52 -i 1
On node3, which is the multicast sender, we execute the command: iperf -c 184.108.40.206 -u --ttl 5 -t 3600
On node3 we see the following output:
Client connecting to 220.127.116.11, UDP port 5001
Sending 1470 byte datagrams
Setting multicast TTL to 5
UDP buffer size: 208 KByte (default)
[ 3] local 192.168.11.8 port 35976 connected with 18.104.22.168 port 5001
On node1 and node2 we see that the nodes receive multicast traffic:
[ 3] 197.0-198.0 sec 128 KBytes 1.05 Mbits/sec 0.036 ms 0/ 89 (0%)
[ 3] 198.0-199.0 sec 129 KBytes 1.06 Mbits/sec 0.046 ms 0/ 90 (0%)
[ 3] 199.0-200.0 sec 128 KBytes 1.05 Mbits/sec 0.041 ms 0/ 89 (0%)
[ 3] 200.0-201.0 sec 128 KBytes 1.05 Mbits/sec 0.043 ms 0/ 89 (0%)
[ 3] 958.0-959.0 sec 128 KBytes 1.05 Mbits/sec 0.015 ms 0/ 89 (0%)
[ 3] 959.0-960.0 sec 128 KBytes 1.05 Mbits/sec 0.019 ms 0/ 89 (0%)
[ 3] 960.0-961.0 sec 128 KBytes 1.05 Mbits/sec 0.012 ms 0/ 89 (0%)
[ 3] 961.0-962.0 sec 129 KBytes 1.06 Mbits/sec 0.015 ms 0/ 90 (0%)
[ 3] 962.0-963.0 sec 128 KBytes 1.05 Mbits/sec 0.027 ms 0/ 89 (0%)
[ 3] 963.0-964.0 sec 128 KBytes 1.05 Mbits/sec 0.025 ms 0/ 89 (0%)
Setting libvirt_vif_driver=nova.virt.libvirt.vif.LibvirtGenericVIFDriver is not an option for them, because that driver does not support security groups.
Is there an error in the description of this BZ? The BZ title indicates the generic VIF driver, but the steps to reproduce indicate configuring the HybridOVSBridgeDriver.
Please see above comment for reason for NEEDINFO.
Clearing NEEDINFO. The hybrid driver does appear to have been configured as would have been required for security group support for the release indicated.
This appears to be the same issue as reported here: https://bugzilla.redhat.com/show_bug.cgi?id=902922.
Brent, I updated the title. The issue is that when the customer uses the LibvirtHybridOVSBridgeDriver driver, which supports security groups, multicast_snooping is also enabled on the brport within the tap device.
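To check the snooping state mentioned above on a compute node, something like the following could be used. This is only a sketch: the function name report_snooping and its optional root-directory argument are hypothetical, and the sysfs path is simply the one from the echo commands in the description.

```shell
# Hypothetical helper: print "<tap device> <value>" for the bridge behind
# each tap device. A value of 1 means multicast snooping is enabled.
# The optional first argument overrides the sysfs directory (an assumption
# made so the sketch can be tried against a copy of the tree).
report_snooping() {
    net_root="${1:-/sys/devices/virtual/net}"
    for f in "$net_root"/tap*/brport/bridge/bridge/multicast_snooping; do
        [ -r "$f" ] || continue      # glob matched nothing, or not readable
        tap="${f#"$net_root"/}"      # drop the leading directory
        tap="${tap%%/*}"             # keep only the tap device name
        printf '%s %s\n' "$tap" "$(cat "$f")"
    done
}
```

Calling report_snooping with no argument reads the live tree; tap devices that are not attached to a bridge are skipped.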
Considering the similarity to the bz mentioned above, I'd say this is a source of the problem and is not necessarily OpenStack specific. In order to properly address this, we need to:
- Determine whether it is expected to have to disable multicast snooping or not when doing this kind of thing. If it should not be necessary then it looks like a bug with linux bridging or similar and we should fix it there.
- If it is not a bug with linux bridging and it is expected that snooping be disabled then we need to determine whether this is something libvirt should do when constructing bridges for the VMs, etc. If so, the issue should either be reassigned to libvirt or associated with other similar bugs already reported against libvirt.
- Regardless of either of the above, there probably should be some discussion of whether this is something that is appropriate to work around within OpenStack.
The issue is fixed in RHOS5, as it no longer requires the use of the OVSHybridDriver. At this point, finding a workaround that would allow them to disable multicast_snooping for an entire host would be useful.
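A host-wide variant of the per-tap echo commands could be sketched as below. It assumes that every Linux bridge on the host exposes bridge/multicast_snooping under /sys/devices/virtual/net, the same tree used in the description. The function name disable_snooping and its optional root-directory argument are hypothetical, and this has not been tried on a RHOS 4 node.

```shell
# Hypothetical helper: write 0 to multicast_snooping on every bridge found
# under the given sysfs directory. Needs root when run against the live tree.
disable_snooping() {
    net_root="${1:-/sys/devices/virtual/net}"
    for f in "$net_root"/*/bridge/multicast_snooping; do
        [ -w "$f" ] || continue      # not a bridge, or not writable
        echo 0 > "$f"
        # Report the bridge name (two directory levels above the file).
        echo "snooping disabled on $(basename "${f%/bridge/multicast_snooping}")"
    done
}
```

The setting is not persistent, and bridges created later for new instances start with the kernel default again, so the function would have to be re-run (for example periodically) for the workaround to keep covering the whole host.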
The HybridOVSBridgeDriver was obsoleted but the functionality was rolled up into the generic driver. Are you implying that linux bridges are therefore no longer used, rendering this issue moot in 5? Linux bridges are actually still created to implement security groups, so if this works in 5 it is for some other reason.
Multicast reliability seems to have been related to kernel versions (e.g. https://bugzilla.redhat.com/show_bug.cgi?id=880035) so maybe there is a kernel fix underway already.
The issue we have is that the customer currently has to use the OVSHybrid driver because security groups do not work otherwise. However, when they use the OVSHybrid driver, they lose the ability to use multicast.
This becomes a moot point when they upgrade to 5, because then both security groups and multicast work with the generic driver. However, they are looking for a workaround or a fix for 4, as that is what they are currently on.
Upgrading kernel to 2.6.32-431.23.3.el6.x86_64 solved the issue. Pushing update to customer.
I'm closing this report as the root cause of the bug is a kernel issue.