Bug 1404567 - VxLAN setup with DPDK - Compute nodes needs to be restarted
Summary: VxLAN setup with DPDK - Compute nodes needs to be restarted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: Upstream M3
Target Release: 11.0 (Ocata)
Assignee: Jaganathan Palanisamy
QA Contact: Yariv
URL:
Whiteboard:
Depends On:
Blocks: 1406865 1413578
 
Reported: 2016-12-14 06:50 UTC by Karthik Sundaravel
Modified: 2017-05-17 19:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-17 19:51:03 UTC


Links
System: Red Hat Product Errata
ID: RHEA-2017:1245
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory
Last Updated: 2017-05-17 23:01:50 UTC

Description Karthik Sundaravel 2016-12-14 06:50:49 UTC
NIC config for the DPDK bridge on the compute node:
            -
              type: ovs_user_bridge
              name: br-link
              use_dhcp: false
              addresses:
                -
                  ip_netmask: {get_param: TenantIpSubnet}
              members:
                -
                  type: ovs_dpdk_port
                  name: dpdk0
                  members:
                    -
                      type: interface
                      name: nic4

After deployment, br-link on the compute node is not up (it loses its static IP). A reboot is required to work around the issue.

Comment 1 Saravanan KR 2016-12-15 07:04:53 UTC
When openvswitch is restarted on the compute node with "systemctl restart openvswitch" (a restart is required after setting DPDK_OPTIONS), the OVS user bridge "br-link" loses its IP. We have captured the logs [2], where the restart is issued at line #103. Can you please let us know if we are missing any configuration in the deployment? Let us know if you need access to the TripleO environment.

neutron/openvswitch-agent.log - http://pastebin.test.redhat.com/439103
openvswitch/ovs-vswitchd.log - http://pastebin.test.redhat.com/439102

[1] https://github.com/krsacme/tht-dpdk/blob/rhosp10k3/nic-configs/computeovsdpdk.yaml
[2] http://pastebin.test.redhat.com/439097

Comment 2 Saravanan KR 2016-12-15 07:05:53 UTC
Aaron Conole's comments (in mail):

Did you set the datapath type correctly?  I didn't see so in the logs,
but you'll need to issue:

   ovs-vsctl set bridge br-link datapath_type=netdev

since it contains a port called dpdk0.  I didn't see anything else that
stood out as being wrong.  If the above doesn't correct it (and I
suggest restarting the ovs and address acquisition software once making
that change to be sure), can you capture an sosreport?
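
A minimal sketch of the check suggested above: read the bridge's datapath_type and set it to netdev only if it differs. The function name ensure_netdev and the OVS_VSCTL override variable are hypothetical names introduced for illustration; only the ovs-vsctl get/set calls come from the tool itself.

```shell
# Command used to talk to OVS; overridable (assumption) so the sketch can be dry-run.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# ensure_netdev <bridge>: set datapath_type=netdev unless it is already set.
ensure_netdev() {
    bridge="$1"
    # ovs-vsctl may print string values quoted (an empty value is ""), so strip quotes.
    current=$("$OVS_VSCTL" get bridge "$bridge" datapath_type | tr -d '"')
    if [ "$current" = "netdev" ]; then
        echo "$bridge: datapath_type is already netdev"
    else
        echo "$bridge: datapath_type is '$current', setting netdev"
        "$OVS_VSCTL" set bridge "$bridge" datapath_type=netdev
    fi
}
```

Running `ensure_netdev br-link` as root would apply the change only when needed. As the mail notes, openvswitch and whatever assigns the bridge address would likely still need a restart afterwards.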

Comment 3 Sanjay Upadhyay 2016-12-15 07:35:10 UTC
Created attachment 1231990 [details]
sosreport of the compute node

[root@overcloud-compute-0 ~]# ovs-vsctl list bridge | egrep "name|datapath_type"
datapath_type       : netdev
name                : br-ex
datapath_type       : netdev
name                : br-int
datapath_type       : netdev
name                : br-tun
datapath_type       : netdev
name                : br-link

Comment 4 Karthik Sundaravel 2017-01-06 07:21:11 UTC
We find that the OVS bridge br-link loses the IP when openvswitch is restarted (after configuring the DPDK_OPTIONS).

As a workaround, we either reboot the compute node or follow the steps below:

1. ifup br-link
2. systemctl restart neutron-openvswitch-agent.
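
The two steps above could be wrapped in a small script that runs them only when the bridge actually has no IPv4 address. This is a sketch under assumptions: the function names has_ipv4 and recover_bridge and the BRIDGE variable are hypothetical, and it expects the ifup and ip commands to be available on the compute node.

```shell
# Bridge to check; the default is the bridge name from this report.
BRIDGE="${BRIDGE:-br-link}"

# has_ipv4: reads `ip -4 -o addr show` output on stdin and succeeds
# if at least one "inet" address line is present.
has_ipv4() {
    grep -q ' inet '
}

# recover_bridge: re-apply the bridge's address and restart the agent
# only if the bridge currently has no IPv4 address.
recover_bridge() {
    if ip -4 -o addr show dev "$BRIDGE" | has_ipv4; then
        echo "$BRIDGE already has an IPv4 address"
        return 0
    fi
    ifup "$BRIDGE" &&
        systemctl restart neutron-openvswitch-agent
}
```

Calling `recover_bridge` as root after the openvswitch restart would apply the workaround without a reboot.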

Comment 5 Saravanan KR 2017-01-10 05:24:04 UTC
This issue is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1373085. Thanks to Peng for pointing it out.

As per the discussion there, if openvswitch is restarted, then the service that assigns the IP to the OVS bridge should also be restarted, which in our case is network.service.

We manually restarted network.service and found that br-link gets its IP successfully. Now we need to add a deployment step that restarts network.service whenever openvswitch is restarted.

The puppet manifest vswitch::dpdk is responsible for setting DPDK_OPTIONS and restarting openvswitch. We need to analyze how to incorporate this dependency in the deployment.
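
The manual sequence described in this comment can be sketched as follows. It only illustrates the restart ordering that was tested by hand, not how the deployment would integrate it; the function name restart_ovs_with_network and the BRIDGE variable are hypothetical.

```shell
# Bridge whose address is inspected afterwards; the default is from this report.
BRIDGE="${BRIDGE:-br-link}"

# restart_ovs_with_network: restart openvswitch, then the service that
# assigns the bridge IP (network.service here), then show the bridge address.
restart_ovs_with_network() {
    systemctl restart openvswitch || return 1
    systemctl restart network.service || return 1
    ip -4 -o addr show dev "$BRIDGE"
}
```

If the final command prints no "inet" line, the bridge still has no IPv4 address.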

Comment 6 Saravanan KR 2017-02-27 14:51:32 UTC
Along with the OVS 2.6 migration, we are changing the DPDK initialization flow so that the restart is no longer needed. Once the list of changes is ready, we will update the BZ.

Comment 7 Vijay Chundury 2017-03-02 11:39:24 UTC
QA can validate this scenario using the document provided by Karthik:

https://docs.google.com/a/redhat.com/document/d/1VhpoBcKj5oVZqXUoDPUKh_g43ITUazZcSl-GiJOxYUs/edit?usp=sharing

QE team, please talk to Karthik before testing this scenario so that you are in sync with the document.

Comment 8 Eyal Dannon 2017-05-01 06:31:59 UTC
Hi,

After deployment, br-link got an IP address:

15: br-link: <BROADCAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 14:02:ec:7c:87:7c brd ff:ff:ff:ff:ff:ff
    inet 10.35.141.21/28 brd 10.35.141.31 scope global br-link
       valid_lft forever preferred_lft forever
    inet6 fe80::1602:ecff:fe7c:877c/64 scope link 
       valid_lft forever preferred_lft forever

I've verified this bug using the template:
https://code.engineering.redhat.com/gerrit/gitweb?p=nfv-qe.git;a=tree;f=heat-templates-configs/samples/ospd-11-vxlan-dpdk-single-port-ctlplane-bonding

Puddle: 2017-04-24.2

Comment 9 errata-xmlrpc 2017-05-17 19:51:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

