Bug 1191922
| Summary: | ovs-agent restart or ovs restart can cause a network storm bringing down the net | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Miguel Angel Ajo <majopela> |
| Component: | openstack-neutron | Assignee: | Miguel Angel Ajo <majopela> |
| Status: | CLOSED ERRATA | QA Contact: | Nir Magnezi <nmagnezi> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 6.0 (Juno) | CC: | apevec, chrisw, fdinitto, fleitner, ihrachys, jbenc, lhh, lpeer, majopela, mlopes, nyechiel, oblaut, rhos-maint, rkhan, sclewis, scohen, yeylon |
| Target Milestone: | z1 | Keywords: | Regression, ZStream |
| Target Release: | 6.0 (Juno) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-neutron-2014.2.2-3.el7ost | Doc Type: | Bug Fix |
| Doc Text: | Previously, the br-tun bridge was reset (OpenFlow rules and ports) during openvswitch-agent restarts and, under some conditions, after neutron-server restarts. Consequently, if a broadcast packet entered br-tun while it had no OpenFlow rules, and br-tun on at least two other hosts had been reset in the same way, the packet triggered a network broadcast storm that raised network usage and Open vSwitch CPU usage on all hosts. This update fixes the issue by automatically setting br-tun to secure mode during the reset. As a result, packets are not forwarded in the absence of OpenFlow rules, and the race condition has been eliminated. | Story Points: | --- |
| Clone Of: | 1185521 | Environment: | |
| Last Closed: | 2015-03-05 18:21:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1185521, 1186492, 1187257, 1191633 | ||
| Bug Blocks: | |||
Comment 6
lpeer
2015-02-16 12:22:58 UTC
Why MODIFIED? Moving back to ON_DEV.
Verified NVR:
openstack-neutron-2014.2.2-3.el7ost.noarch
openvswitch-2.1.2-2.el7_0.2.x86_64
openstack-neutron-openvswitch-2014.2.2-3.el7ost.noarch
Verification Steps:
===================
1. Deploy[1] a setup with at least 3 nodes running openvswitch + vxlan tunneling[2].
2. On each node running openvswitch, stop the neutron OVS agent and restart openvswitch:
# systemctl stop neutron-openvswitch-agent
# systemctl restart openvswitch
3. At this point, the br-tun flow table should be empty:
# ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
4. On each node running openvswitch, bring the br-tun interface up:
# ip l s br-tun up
5. On one of the nodes (the networker, for example), install Python scapy (via yum or pip):
# pip install scapy
6. On each node running openvswitch, monitor br-tun:
# tcpdump -i br-tun -vvv
7. Then run this[3] script on the node you selected, to generate the broadcast:
# python scapy_script.py
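The script in [3] is not reproduced in this report. As a rough illustration of the kind of packet step 7 injects, the sketch below builds a single Ethernet broadcast frame carrying a bare 20-byte IPv4 header with the same fields as the capture under Result (ttl 64, id 1, protocol 0, 169.254.192.1 > 224.0.0.18). It is not the referenced script: the broadcast destination MAC, the function names, and the use of a raw Linux AF_PACKET socket instead of scapy are assumptions made for illustration only.

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # assumption: L2 broadcast destination
ETHERTYPE_IPV4 = 0x0800
IPV4_HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, no options


def ipv4_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum of an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src_ip: str, dst_ip: str) -> bytes:
    """Build a payload-less IPv4 header: tos 0, length 20, id 1, ttl 64, proto 0."""
    fields = [0x45, 0, 20, 1, 0, 64, 0, 0,
              socket.inet_aton(src_ip), socket.inet_aton(dst_ip)]
    # First pass with a zero checksum, second pass with the real one.
    fields[7] = ipv4_checksum(struct.pack(IPV4_HEADER_FORMAT, *fields))
    return struct.pack(IPV4_HEADER_FORMAT, *fields)


def build_broadcast_frame(src_mac: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Prepend an Ethernet header addressed to the broadcast MAC."""
    ethernet = struct.pack("!6s6sH", BROADCAST_MAC, src_mac, ETHERTYPE_IPV4)
    return ethernet + build_ipv4_header(src_ip, dst_ip)


def send_frame(iface: str, frame: bytes) -> None:
    """Send one raw frame on iface (Linux only, requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((iface, 0))
        sock.send(frame)
```

Run as root on the chosen node, a call such as `send_frame("br-tun", build_broadcast_frame(bytes.fromhex("020000000001"), "169.254.192.1", "224.0.0.18"))` would emit one such frame; the source MAC here is a made-up locally administered address.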
Result:
=======
1. Captured on the br-tun interface of the node used in step 7:
18:44:57.448606 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto Options (0), length 20)
169.254.192.1 > 224.0.0.18: ip 0
2. Nothing appears on the br-tun interfaces of the remaining nodes.
3. CPU levels remain normal.
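The per-node check from step 3, combined with the secure-mode behaviour described in the Doc Text, could be scripted roughly as follows. This is a sketch, not part of the original verification: the helper names are hypothetical, and it assumes that the fail mode is read with `ovs-vsctl get-fail-mode` and that every non-blank line of `ovs-ofctl dump-flows` output other than the `... reply (xid=...)` header is one flow entry.

```python
import subprocess


def count_flows(dump_output: str) -> int:
    """Count flow entries in `ovs-ofctl dump-flows` output, skipping the reply header."""
    lines = [line.strip() for line in dump_output.splitlines()]
    return sum(1 for line in lines if line and " reply (" not in line)


def is_safe_while_unprogrammed(fail_mode: str, dump_output: str) -> bool:
    """True if the bridge has no flows and is in secure fail mode."""
    return fail_mode.strip() == "secure" and count_flows(dump_output) == 0


def query_bridge(bridge: str = "br-tun"):
    """Return (fail_mode, dump_flows_output) for a bridge; needs OVS tools and root."""
    def run(cmd):
        return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

    fail_mode = run(["ovs-vsctl", "get-fail-mode", bridge])
    flows = run(["ovs-ofctl", "dump-flows", bridge])
    return fail_mode, flows
```

On a node in the state reached after step 2, `is_safe_while_unprogrammed(*query_bridge("br-tun"))` would be expected to return True with the fixed package, since an empty flow table in secure mode means nothing is forwarded.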
[1] http://jenkins-hurricane.scl.lab.tlv.redhat.com:8080/job/rhel-osp6-rhel7.1-neutron-ml2-vxlan/
[2] https://github.com/nmagnezi/hurricane/blob/master/plugins/installer/packstack/templates/packstack-juno-neutron-ml2-vxlan.ini
[3] https://github.com/nmagnezi/scripts/blob/master/scapy_script.py
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0635.html