Red Hat Bugzilla – Bug 1468631
openvswitch segfaults when changing port VIF MTU and there's traffic flowing
Last modified: 2017-08-21 13:03:14 EDT
Description of problem:
It's easy to get openvswitch to segfault when changing the MTU of a vif port.
A customer of ours uses OVS with OSP10z3 and DPDK with Jumbo frames.
As VIF ports do not currently inherit the MTU from the OVS bridge, the customer must run a cron job to set the MTU on 'vhu*' ports when they come up.
This results in ovs-vswitchd segfaulting very often:
E.g.: ovs-vsctl set interface vhu2fd7027c-33 mtu_request=9000
Jul 05 12:52:46 tkll00p1 kernel: pmd459: segfault at 44 ip 00007fa334156dff sp 00007fa2517ef4d0 error 4 in ovs-vswitchd[7fa33407b000+3b1000]
Jul 05 12:52:46 tkll00p1 systemd: ovs-vswitchd.service: main process exited, code=killed, status=11/SEGV
In ovsdb-server.log we see a line such as this one:
2017-07-05T17:52:47.975Z|00005|fatal_signal|WARN|terminating with signal 15 (Terminated)
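For reference, the kind of cron job described above might look roughly like the sketch below. This is not the customer's actual script: the function name set_vhu_mtu, the OVS_VSCTL override variable, and the bridge name used in the usage note are assumptions made for illustration. Only the "ovs-vsctl set interface ... mtu_request=..." call is taken from this report; "ovs-vsctl list-ports" is used to enumerate the ports of one bridge. The sketch does not work around the crash, it only shows the operation that triggers it.

```shell
#!/bin/sh
# Hypothetical helper: request an MTU on every vhost-user ('vhu*') port of a bridge.
# OVS_VSCTL may be pointed at a stub for a dry run (assumption, not an OVS feature).
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

set_vhu_mtu() {
    bridge="$1"   # bridge whose ports are scanned
    mtu="$2"      # MTU to request, e.g. 9000 for jumbo frames

    # 'list-ports' prints one port name per line.
    "$OVS_VSCTL" list-ports "$bridge" | while read -r port; do
        case "$port" in
            # Only touch vhost-user ports; leave physical/DPDK ports alone.
            vhu*) "$OVS_VSCTL" set interface "$port" mtu_request="$mtu" ;;
        esac
    done
}
```

A cron entry would then source this file and call, for example, set_vhu_mtu br-int 9000 (the bridge name br-int is assumed here; substitute the bridge that carries the vhu ports).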
Version-Release number of selected component (if applicable):
From the field it seems there's about 50% chance of it segfaulting when setting MTU.
Steps to Reproduce:
Most likely this needs commit 546e57d44c473aac2915037f6906c9dd04294105.
Will check to see if that is all.
Is it also possible to get a crash dump from the customer? That would confirm this is the issue.
I've posted an upstream fix for the crash reported:
How fast are we going to have that downstream? Either as a hotfix or in the repos?
Needs to be accepted upstream first - I don't know how long that will take, usually a few days to a week.
*** Bug 1477785 has been marked as a duplicate of this bug. ***
The fix was applied in upstream repository. Please build a new package with it included.
Brew build https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13784598
TAM and SA team met with Cisco 8/8. Cisco is open to providing us with the necessary hardware, or access to a lab with the hardware, or helping us determine if hardware we have is functionally equivalent.
Ravi Anan (firstname.lastname@example.org) is the Cisco engineer we need to contact about test environment.
Here are the notes from our meeting:
Current RH Software on Sprint Environment:
--RH OSP 10.z2
--OVS 2.6.1-3 beta (Compiled with DPDK)
Current OVS Deployment: Open vSwitch version number does not necessarily imply what version of DPDK library the upstream used to compile it. Cisco does track what version of DPDK library works with VIC-1340. We don't know if OVS 2.6.1-3 uses a compatible DPDK library.
Action Item: Cisco to confirm which DPDK libraries work (tested?) with VIC-1340 and to map each DPDK library to an OVS version (or check upstream openvswitch.org?). Cisco PMD drivers are also upstreamed to openvswitch.org. Need to confirm which versions of OVS contain the correct DPDK library and the correct PMD drivers. Need to confirm that we're using a stable branch of DPDK libraries that will accumulate support patches going forward.
Current RH QE Test Lab: RH may lack the correct hardware in their lab to test OVS hotfixes.
Action Item: RH engineering will contact Ravi Anan to determine whether our lab hardware is either the same as or compatible with UCS-B200 and VIC-1340 (Jim Sisul will pass contact info to RH QE) and, if not, how to obtain that hardware for testing purposes.
Moving back to ASSIGNED.
Due to lack of HW to verify and enable support, the ENIC PMD driver will be disabled for 10z4.