Bug 1404567 - VxLAN setup with DPDK - Compute nodes needs to be restarted
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 10.0 (Newton)
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: Upstream M3
Target Release: 11.0 (Ocata)
Assigned To: Jaganathan Palanisamy
QA Contact: Yariv
Keywords: TestOnly, Triaged
Blocks: 1406865 1413578
Reported: 2016-12-14 01:50 EST by Karthik Sundaravel
Modified: 2017-05-17 15:51 EDT

Last Closed: 2017-05-17 15:51:03 EDT
Type: Bug

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1245 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 19:01:50 EDT
Description Karthik Sundaravel 2016-12-14 01:50:49 EST
NIC config for the DPDK bridge on the compute node:
            -
              type: ovs_user_bridge
              name: br-link
              use_dhcp: false
              addresses:
                -
                  ip_netmask: {get_param: TenantIpSubnet}
              members:
                -
                  type: ovs_dpdk_port
                  name: dpdk0
                  members:
                    -
                      type: interface
                      name: nic4

After deployment, br-link on the compute node is not up (it loses its static IP). A reboot is required to work around the issue.
Comment 1 Saravanan KR 2016-12-15 02:04:53 EST
When openvswitch is restarted on the compute node with "systemctl restart openvswitch" (a restart is required after setting DPDK_OPTIONS), the OVS user bridge "br-link" loses its IP. We have captured the logs [2], where the restart is issued at line #103. Can you please let us know if we are missing any configuration in the deployment? Let us know if you need access to the TripleO environment.

neutron/openvswitch-agent.log - http://pastebin.test.redhat.com/439103
openvswitch/ovs-vswitchd.log - http://pastebin.test.redhat.com/439102

[1] https://github.com/krsacme/tht-dpdk/blob/rhosp10k3/nic-configs/computeovsdpdk.yaml
[2] http://pastebin.test.redhat.com/439097
Comment 2 Saravanan KR 2016-12-15 02:05:53 EST
Aaron Conole's comments (in mail):

Did you set the datapath type correctly?  I didn't see so in the logs,
but you'll need to issue:

   ovs-vsctl set bridge br-link datapath_type=netdev

since it contains a port called dpdk0.  I didn't see anything else that
stood out as being wrong.  If the above doesn't correct it (and I
suggest restarting the ovs and address acquisition software once making
that change to be sure), can you capture an sosreport?
Comment 3 Sanjay Upadhyay 2016-12-15 02:35 EST
Created attachment 1231990
sosreport of the compute node

[root@overcloud-compute-0 ~]# ovs-vsctl list bridge | egrep "name|datapath_type"
datapath_type       : netdev
name                : br-ex
datapath_type       : netdev
name                : br-int
datapath_type       : netdev
name                : br-tun
datapath_type       : netdev
name                : br-link
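
In the output above each datapath_type line precedes the name it belongs to, which is easy to misread. A minimal sketch that pairs them up, assuming that ordering holds; the function name bridge_datapath_types is illustrative and not part of OVS:

```shell
# Reads `ovs-vsctl list bridge` output on stdin and prints
# "<bridge-name> <datapath_type>" for each bridge record.
# Assumes datapath_type appears before name within a record,
# as it does in the output captured above.
bridge_datapath_types() {
    awk -F' *: *' '
        $1 == "datapath_type" { dp = $2 }
        $1 == "name" {
            gsub(/"/, "", $2)   # some OVS versions quote the name
            print $2, dp
            dp = ""
        }
    '
}
```

Usage on a compute node would be: ovs-vsctl list bridge | bridge_datapath_types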
Comment 4 Karthik Sundaravel 2017-01-06 02:21:11 EST
We find that the OVS bridge br-link loses its IP when openvswitch is restarted (after configuring DPDK_OPTIONS).

As a workaround, we either reboot the compute node or follow these steps:

1. ifup br-link
2. systemctl restart neutron-openvswitch-agent.
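
The two steps above could be wrapped in a small helper, sketched here under assumptions: the DRY_RUN variable and the function names run and recover_bridge are hypothetical, and the bridge name defaults to br-link as in this report.

```shell
# Runs a command, or only prints it when DRY_RUN=1 (illustrative switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Brings the bridge back up, then restarts the neutron OVS agent.
# Stops at the first step that fails.
recover_bridge() {
    bridge="${1:-br-link}"
    run ifup "$bridge" &&
    run systemctl restart neutron-openvswitch-agent
}
```

Running it with DRY_RUN=1 first shows the commands without touching the node.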
Comment 5 Saravanan KR 2017-01-10 00:24:04 EST
This issue is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1373085. Thanks to Peng for pointing this out.

As per the discussion there, if openvswitch is restarted, the service that assigns the IP to the OVS bridge should also be restarted, which in our case is network.service.

We manually restarted network.service and found that br-link gets its IP successfully. Now we need to add a deployment step that restarts network.service whenever openvswitch is restarted.

The puppet manifest vswitch::dpdk is responsible for setting DPDK_OPTIONS and restarting openvswitch. We need to analyze how to incorporate this dependency into the deployment.
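
The manual check described in this comment can be sketched as a shell helper. This only illustrates the restart ordering; it is not the puppet change itself, and the function names has_ipv4 and restart_ovs_then_network are hypothetical.

```shell
# Succeeds if the `ip -o -4 addr show dev <bridge>` output on stdin
# contains at least one IPv4 address entry.
has_ipv4() {
    grep -q ' inet '
}

# Restarts openvswitch, then network.service, then confirms that the
# bridge (default br-link) has an IPv4 address again.
restart_ovs_then_network() {
    bridge="${1:-br-link}"
    systemctl restart openvswitch &&
    systemctl restart network.service &&
    ip -o -4 addr show dev "$bridge" | has_ipv4
}
```

A non-zero exit status from restart_ovs_then_network means either a restart failed or the bridge still has no IPv4 address.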
Comment 6 Saravanan KR 2017-02-27 09:51:32 EST
Along with the OvS 2.6 migration, we are changing the DPDK initialization flow so that the restart is no longer needed. Once the list of changes is ready, we will update the BZ.
Comment 7 Vijay Chundury 2017-03-02 06:39:24 EST
QA can validate this scenario using the document provided by Karthik.

https://docs.google.com/a/redhat.com/document/d/1VhpoBcKj5oVZqXUoDPUKh_g43ITUazZcSl-GiJOxYUs/edit?usp=sharing.

We request that the QE team talk to Karthik before testing this scenario, to stay in sync with the document.
Comment 8 Eyal Dannon 2017-05-01 02:31:59 EDT
Hi,

After deployment, br-link got an IP address:

15: br-link: <BROADCAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 14:02:ec:7c:87:7c brd ff:ff:ff:ff:ff:ff
    inet 10.35.141.21/28 brd 10.35.141.31 scope global br-link
       valid_lft forever preferred_lft forever
    inet6 fe80::1602:ecff:fe7c:877c/64 scope link 
       valid_lft forever preferred_lft forever

I've verified this bug using the template:
https://code.engineering.redhat.com/gerrit/gitweb?p=nfv-qe.git;a=tree;f=heat-templates-configs/samples/ospd-11-vxlan-dpdk-single-port-ctlplane-bonding

Puddle: 2017-04-24.2
Comment 9 errata-xmlrpc 2017-05-17 15:51:03 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245
