Bug 1384562 - RHOS 10 DPDK instance unable to get ip address dhcp lease
Keywords:
Status: CLOSED DUPLICATE of bug 1384774
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: urgent
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: RHOS Documentation Team
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks: 1325680 1384774
 
Reported: 2016-10-13 14:34 UTC by Maxim Babushkin
Modified: 2016-11-17 16:18 UTC (History)
19 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-08 14:03:39 UTC
Target Upstream Version:


Attachments

Description Maxim Babushkin 2016-10-13 14:34:51 UTC
Description of problem:
RHOS 10 added support for DPDK configuration during overcloud deployment.
Once an overcloud with DPDK support is deployed, a booted instance is unable to get an IP address lease.

Version-Release number of selected component (if applicable):
RHOS 10
Product version: 10
Product core version: 10
Product core build: 2016-10-06.1

openstack-neutron-openvswitch-9.0.0-0.20160929051647.71f2d2b.el7ost.noarch
openvswitch-2.5.0-5.git20160628.el7fdb.x86_64
python-openvswitch-2.5.0-5.git20160628.el7fdb.noarch

I have tried to update the openvswitch to the openvswitch-2.5.0-14.git20160727.el7fdp.x86_64.rpm version with the same result.


How reproducible:
1) Deploy the RHOS 10 overcloud with DPDK support.
All the configuration, including DPDK interface binding, is done during the deployment.
2) Boot the instance (VM).
The instance should get the IP address lease, but it doesn't.

Actual results:
The instance is unable to get the IP address lease.

Expected results:
The DPDK instance should get an IP address from the network the DPDK interface is bound to.

Comment 3 Aaron Conole 2016-10-13 16:56:16 UTC
Please attach an sosreport, and include the output of dpdk_nic_bind.py; if possible also include the information in ovs-vswitchd.log (this log is included in newer versions of sosreport).

Comment 5 Assaf Muller 2016-10-14 00:20:36 UTC
(In reply to Maxim Babushkin from comment #0)

Was anyone from NFV QE able to use OSPd to install with OVS-DPDK support and successfully boot a VM with networking connectivity? Or is this the first time you've tried and hit this issue?

Also, can you provide an SOS report of controller nodes and the compute node the VM was scheduled to? Alternatively, can you supply SSH access information to an environment this reproduces in?

Comment 10 Maxim Babushkin 2016-10-15 23:02:52 UTC
This is the first time NFV QE team installs the OSPd with the OVS-DPDK support.

Requested logs attached:
* Controller sosreport
* Compute sosreport
* dpdk_nic_bind output
* ovs-vswitchd.log file

Comment 11 Assaf Muller 2016-10-16 21:25:30 UTC
I'm retargeting back to OSP for now. We've had successful deployments of OSP with OVS-DPDK without the OSP Director integration, so I'm assuming this is an integration issue and not an actual OVS-DPDK issue.

@Maxim, please work with Vijay's team that implemented the OSP-d OVS-DPDK integration.

Comment 12 Vijay Chundury 2016-10-17 03:36:29 UTC
We will be looking into this bug. Karthik S will be our point of contact.

Comment 13 Karthik Sundaravel 2016-10-17 15:59:11 UTC
After making 2 changes in the setup, I am able to boot instances and ping between them.

1. A NIC needs to be configured on the controller node for the DPDK provider network. The change is required in the NIC configs for the controller node.

2. After deployment, the br-int and br-link bridges are down on the controller and compute nodes. Bring them up with:
   ip l s dev br-link up
   ip l s dev br-int up

Comment 14 Karthik Sundaravel 2016-10-17 17:17:23 UTC
After discussing with Terry and Assaf, the change mentioned in point #2 is not required.

So the only change required is in the templates: the controller network configs must also include a NIC on the provider network.

Ex:
   type: ovs_bridge
   name: br-link
   use_dhcp: false
   members:
     -
       type: interface
       name: nic4

Comment 15 Assaf Muller 2016-10-17 21:54:30 UTC
Looking at comment 14, this is NOTABUG. However, before closing, I'd like to make sure there are no action items as far as documentation goes. Thoughts?

Comment 16 Vijay Chundury 2016-10-18 07:17:55 UTC
Assaf,
Agree with you.
This change to the network configuration template is for the operator (documentation). If somebody lets me know where this has to be placed, I will do it. For now I will move the bug status to ON_QA to verify.

Comment 17 Maxim Babushkin 2016-10-18 08:50:27 UTC
I tried to verify the deployment of the overcloud with the new template configuration.
I modified the controller config as suggested by Karthik.
Once the deployment finished, I booted the instance, but got the same behavior as before.

The instance is still unable to get the DHCP IP lease.

Comment 18 Karthik Sundaravel 2016-10-18 10:25:00 UTC
In the setup, the flavor key needs to be set for hugepages
nova flavor-key m1.nano set "hw:mem_page_size=large"

After this, I could see that the DHCP requests from the guests are serviced, but the guest fails to get the keys from the metadata server. I'll look into it.

Comment 19 Karthik Sundaravel 2016-10-18 16:03:30 UTC
At a high level, the compute node requires a reboot after modifying
/usr/lib/systemd/system/openvswitch-nonetwork.service and
/usr/share/openvswitch/scripts/ovs-ctl.


I've modified the first-boot scripts (http://pastebin.test.redhat.com/422080)
and post-install script (http://pastebin.test.redhat.com/422082) accordingly.

The post deployment commands to create guests can be found at http://pastebin.test.redhat.com/422083

With these changes, I am able to launch guests and ping between them.

Comment 20 Maxim Babushkin 2016-10-19 19:37:00 UTC
I have verified the deployment of the OVS DPDK feature within RHOS 10.

The steps required for the deployment are the following:
1) Create an OVS bridge dedicated to DPDK, with the specified interface, in the compute.yaml template:
   Ex.
      -
        type: ovs_user_bridge
        name: br-link
        use_dhcp: false
        members:
          -
            type: ovs_dpdk_port
            name: dpdk0
            members:
              -
                type: interface
                name: nic3

2) Create an OVS bridge with the same name in the controller.yaml template:
   Ex.
      -
       type: ovs_bridge
       name: br-link
       use_dhcp: false
       members:
         -
           type: interface
           name: nic4

3) Use first-boot script (http://pastebin.test.redhat.com/422080)
   and post-install script (http://pastebin.test.redhat.com/422082) as provided by Karthik.
   Add a reference to scripts within the network-environment.yaml file.

4) Add the neutron-ovs-dpdk.yaml file to the overcloud deploy command ("-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml"), and override its variables within the network-environment.yaml file.

5) Add ComputeKernelArgs arguments to network-environment.yaml file.
   ComputeKernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 intel_iommu=on"

6) Add NovaSchedulerDefaultFilters arguments to network-environment.yaml file.
   NovaSchedulerDefaultFilters: "RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter,NUMATopologyFilter"

7) After the overcloud deploy, apply flavor key on a flavor that will be used for the DPDK instance deploy.
   # nova flavor-key m1.nano set "hw:mem_page_size=large"

8) Perform all the standard steps (creation of the networks, flavor, and security group rules) that are not specific to the DPDK feature deployment, as provided by Karthik at the following link (http://pastebin.test.redhat.com/422083).

9) Boot the instances.


With all these steps, I was able to verify that two VM instances boot and are reachable via ping and SSH.
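For reference, the settings from steps 3, 5, and 6 above might be collected in network-environment.yaml roughly as follows. This is a minimal sketch, not the exact contents of the pastebin scripts: the script file names are assumptions, and only the resource registry keys and parameter values quoted in this thread are taken as given.

```yaml
# network-environment.yaml (sketch; script file names are assumptions)
resource_registry:
  # Step 3: first-boot and post-install scripts provided by Karthik
  OS::TripleO::NodeUserData: first-boot.yaml
  OS::TripleO::NodeExtraConfigPost: post-install.yaml

parameter_defaults:
  # Step 5: hugepages and IOMMU kernel args for the compute nodes
  ComputeKernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 intel_iommu=on"
  # Step 6: scheduler filters, including NUMATopologyFilter for hugepage-backed guests
  NovaSchedulerDefaultFilters: "RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter,NUMATopologyFilter"
```

The deploy command (step 4) then includes both environment files, e.g. `openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml -e network-environment.yaml`.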

Comment 21 Assaf Muller 2016-10-19 20:27:06 UTC
Is there a known issue blocking us from marking this bug and the RFE this blocks as Verified?

Comment 22 Maxim Babushkin 2016-10-19 20:46:41 UTC
I'm not aware of any additional issues.
I just think we should add these guided steps to the documentation.

Comment 23 Assaf Muller 2016-10-19 20:50:11 UTC
(In reply to Maxim Babushkin from comment #22)
> I'm not aware of any additional issues.
> Just think we should add these guided steps to the documentation.

OK, I think the proper way forward is to mark this bug as Verified and copy the notes regarding documentation to https://bugzilla.redhat.com/show_bug.cgi?id=1325680.

Thoughts?

Comment 24 Maxim Babushkin 2016-10-19 21:23:37 UTC
Agree.

