Bug 1575016 - ovs-vswitchd mempool free race condition
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openvswitch
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assigned To: Kevin Traynor
QA Contact: qding
Reported: 2018-05-04 10:30 EDT by Kevin Traynor
Modified: 2018-06-29 05:55 EDT
Fixed In Version: openvswitch-2.9.0-28.el7fdn
Last Closed: 2018-06-21 09:36:35 EDT
Type: Bug


External Trackers:
Red Hat Product Errata RHBA-2018:1962 (last updated 2018-06-21 09:37 EDT)

Description Kevin Traynor 2018-05-04 10:30:19 EDT
As reported in Bz1551682, it is possible that when a mempool is freed, some of the mbufs from it are still in use. This can work, but it is PMD-dependent and racy.

Change the mempool free logic to ensure that all mbufs have been returned to the mempool before freeing.
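Purely as illustration (a toy shell model, not the actual C change in netdev-dpdk), the new rule amounts to deferring the free until the pool reports every mbuf returned:

```shell
# Toy model of the deferred-free logic: a pool with $total mbufs, of which
# $returned have come back so far. try_free only succeeds once they are all back.
total=4
returned=2
try_free() {
  if [ "$returned" -eq "$total" ]; then
    echo "freed"
  else
    echo "deferred"
  fi
}
try_free            # prints "deferred": 2 of 4 mbufs are still in use
returned=$total     # all mbufs have now been returned to the pool
try_free            # prints "freed"
```

The real implementation performs this check on the DPDK mempool itself; the counters here only model the idea.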
Comment 3 qding 2018-05-16 05:17:34 EDT
Hi Kevin,

I'm verifying the bug. Do you have steps to reproduce the issue, or any idea on how to verify it?

Thanks,
QJ
Comment 4 Kevin Traynor 2018-05-16 14:54:13 EDT
Hi QJ,

I couldn't reproduce the customer's segfault, so I think the best way to validate is to check the logs and confirm that the new mempool-freeing scheme is working as expected.

1. Set up 2 phy ports in OVS-DPDK on the same NUMA node, with the same MTU, and run traffic between them

2. Turn on debug logging for netdev_dpdk
# ovs-appctl vlog/set netdev_dpdk:console:dbg
# ovs-appctl vlog/set netdev_dpdk:file:dbg

3. Check the name of the mempool that is currently in use
# ovs-appctl netdev-dpdk/get-mempool-info dpdk0 | grep "mempool"
mempool <ovs_mp_2030_0_262144>@0x7f9c500f6b40
# ovs-appctl netdev-dpdk/get-mempool-info dpdk1 | grep "mempool"
mempool <ovs_mp_2030_0_262144>@0x7f9c500f6b40
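Both ports report the same name and address here, confirming they share a single mempool. The name can be pulled out of that output with a one-liner; the `$info` variable below stands in for live command output, and decoding the name's fields (they appear to encode the MTU-derived mbuf size, NUMA socket, and mbuf count) is an assumption, not a documented format:

```shell
# Parse the mempool name out of the get-mempool-info line shown above.
# $info stands in for the live output of ovs-appctl netdev-dpdk/get-mempool-info.
info='mempool <ovs_mp_2030_0_262144>@0x7f9c500f6b40'
name=$(printf '%s\n' "$info" | sed -n 's/.*<\([^>]*\)>.*/\1/p')
echo "$name"   # prints ovs_mp_2030_0_262144
```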

4. Change MTU size of the ports to 5000
# ovs-vsctl -- set Interface dpdk0 mtu_request=5000
|netdev_dpdk|DBG|Allocated "ovs_mp_6126_0_262144"
# ovs-vsctl -- set Interface dpdk1 mtu_request=5000
|netdev_dpdk|DBG|Reusing mempool "ovs_mp_6126_0_262144"

5. At this point the ovs_mp_2030_0_262144 mempool is no longer associated with any of the ports, but it has not been freed, in order to allow more time for buffers to be returned to it. If you are running this in a script, it is better to wait a couple of seconds here.

6. Change the MTU of the ports to 9000
# ovs-vsctl -- set Interface dpdk0 mtu_request=9000
|netdev_dpdk|DBG|Freeing mempool "ovs_mp_2030_0_262144"
                 ^^^^^^^ 
|netdev_dpdk|DBG|Allocated "ovs_mp_9198_0_262144"

The Freeing message will most likely appear here, but depending on the traffic pattern, driver, etc., it is also possible that there are still some in-use buffers; in that case it will take some further time and MTU changes before the mempool can be freed. That is perfectly fine too.

# ovs-vsctl -- set Interface dpdk1 mtu_request=9000
|netdev_dpdk|DBG|Reusing mempool "ovs_mp_9198_0_262144"
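The whole check in steps 1-6 boils down to watching for the Allocated/Reusing/Freeing debug lines. Here is a minimal sketch of that log check, using a here-doc sample in place of the real ovs-vswitchd log; the message text is taken from the output above, but treat the log path and exact wording as assumptions for your setup:

```shell
# Sample debug lines standing in for /var/log/openvswitch/ovs-vswitchd.log.
log=$(cat <<'EOF'
|netdev_dpdk|DBG|Allocated "ovs_mp_6126_0_262144"
|netdev_dpdk|DBG|Reusing mempool "ovs_mp_6126_0_262144"
|netdev_dpdk|DBG|Freeing mempool "ovs_mp_2030_0_262144"
|netdev_dpdk|DBG|Allocated "ovs_mp_9198_0_262144"
EOF
)
# A healthy run eventually frees the pool for the old MTU: the 2030 pool is
# freed only after both ports have moved over to the 6126 pool.
freed=$(printf '%s\n' "$log" | grep -c 'Freeing mempool "ovs_mp_2030_0_262144"')
echo "old mempool freed: $freed time(s)"
```

If the count stays at zero across several MTU changes, some mbufs are still outstanding, which per the note above is not necessarily a failure.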

thanks,
Kevin
Comment 5 qding 2018-05-22 02:28:07 EDT
Failed to reproduce the issue in the RH lab with either openvswitch-2.6.1-16.git20161206.el7ost.x86_64 or openvswitch-2.9.0-15.el7fdp.x86_64.rpm.

Verified with openvswitch-2.9.0-36.el7fdp.x86_64.rpm and the steps in Comment #4.
No segfault was seen, and the logs show:
|netdev_dpdk|DBG|Reusing mempool "ovs_mp_9198_1_65536"
|netdev_dpdk|DBG|Reusing mempool "ovs_mp_6126_1_32768"
|netdev_dpdk|DBG|Freeing mempool "ovs_mp_9198_1_65536"
Comment 7 errata-xmlrpc 2018-06-21 09:36:35 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1962
Comment 8 Kevin Traynor 2018-06-28 08:48:14 EDT
(In reply to qding from comment #5)

> Verified with openvswitch-2.9.0-36.el7fdp.x86_64.rpm and the steps in
> Comment$4.

Just to note, I think there is a typo and this should be the fdn package, not fdp, i.e. openvswitch-2.9.0-36.el7fdn.x86_64.rpm.

There is no fdp package of that number in brew.
Comment 9 qding 2018-06-28 20:58:27 EDT
(In reply to Kevin Traynor from comment #8)
> (In reply to qding from comment #5)
> 
> > Verified with openvswitch-2.9.0-36.el7fdp.x86_64.rpm and the steps in
> > Comment$4.
> 
> Just to note, I think there is a typo and this should be the fdn package,
> not fdp. i.e. openvswitch-2.9.0-36.el7fdn.x86_64.rpm
> 
> There is no fdp package of that number in brew.

I do use openvswitch-2.9.0-36.el7fdp.x86_64.rpm. You can find it in http://download-node-02.eng.bos.redhat.com/brewroot/packages/openvswitch/2.9.0/36.el7fdp/
