Created attachment 1319084 [details]
log

Description of problem:
Deploying a containerized OSP-12 DPDK setup fails during overcloud_deploy.

openstack stack failures list --long overcloud

overcloud.Controller.0.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: ede3cf15-33a1-49f9-afcc-286311c51f49
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.Compute.0.NetworkDeployment:
  resource_type: OS::TripleO::SoftwareDeployment
  physical_resource_id: da87ee85-b993-4046-87b2-e2ed4acbe500
  status: CREATE_FAILED
  status_reason: |
    Error: resources.NetworkDeployment: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1

Traceback from the compute node:

Traceback (most recent call last):
os-collect-config: File "/usr/bin/os-net-config", line 10, in <module>
localhost os-collect-config: sys.exit(main())
localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/cli.py", line 204, in main
localhost os-collect-config: provider.add_object(obj)
localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/__init__.py", line 68, in add_object
localhost os-collect-config: self.add_object(member)
localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/__init__.py", line 100, in add_object
localhost os-collect-config: self.add_ovs_dpdk_port(obj)
localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/impl_ifcfg.py", line 639, in add_ovs_dpdk_port
localhost os-collect-config: utils.bind_dpdk_interfaces(ifname, ovs_dpdk_port.driver, self.noop)
Aug 28 08:00:54 localhost os-collect-config: File "/usr/lib/python2.7/site-packages/os_net_config/utils.py", line 240, in bind_dpdk_interfaces
Aug 28 08:00:54 localhost os-collect-config: raise OvsDpdkBindException(msg)
Aug 28 08:00:54 localhost os-collect-config: os_net_config.utils.OvsDpdkBindException: Failed to bind interface p1p1 with dpdk

Version-Release number of selected component (if applicable):
OSP-12
puppet-neutron-11.3.0-0.20170805104936.743dde6.el7ost.noarch
python-neutronclient-6.5.0-0.20170807200849.355983d.el7ost.noarch
openstack-neutron-lbaas-11.0.0-0.20170807144457.c9adfd4.el7ost.noarch
python-neutron-11.0.0-0.20170807223712.el7ost.noarch
python-neutron-lbaas-11.0.0-0.20170807144457.c9adfd4.el7ost.noarch
openstack-neutron-linuxbridge-11.0.0-0.20170807223712.el7ost.noarch
openstack-neutron-ml2-11.0.0-0.20170807223712.el7ost.noarch
openstack-neutron-11.0.0-0.20170807223712.el7ost.noarch
openstack-neutron-sriov-nic-agent-11.0.0-0.20170807223712.el7ost.noarch
python-neutron-lib-1.9.1-0.20170731102145.0ef54c3.el7ost.noarch
openstack-neutron-metering-agent-11.0.0-0.20170807223712.el7ost.noarch
openstack-neutron-common-11.0.0-0.20170807223712.el7ost.noarch
openstack-neutron-openvswitch-11.0.0-0.20170807223712.el7ost.noarch

How reproducible:
always

Steps to Reproduce:
1. Run an OSP-12 DPDK deployment using the attached templates.
2.
3.

Actual results:
The deployment fails.

Expected results:

Additional info:
Created attachment 1319085 [details] logs and commands
From /var/log/messages:

Aug 28 07:58:32 localhost dhclient[1494]: receive_packet failed on p1p1: Network is down
Aug 28 07:58:32 localhost NetworkManager[1048]: <info> [1503921512.0951] device (p1p1): state change: ip-config -> unmanaged (reason 'removed') [70 10 36]
Aug 28 07:58:32 localhost kernel: ixgbe 0000:05:00.0: complete
Aug 28 07:58:32 localhost kernel: vfio-pci: probe of 0000:05:00.0 failed with error -22

Manual invocation of "driverctl --debug set-override 0000:05:00.0 vfio-pci" results in:

driverctl: setting driver override for 0000:05:00.0: vfio-pci
driverctl: loading driver vfio-pci
driverctl: unbinding previous driver ixgbe
driverctl: reprobing driver for 0000:05:00.0
driverctl: failed to bind device 0000:05:00.0 to driver vfio-pci

Saravanan, can you provide some advice on troubleshooting/debugging?
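
One way to narrow down the vfio-pci probe failure above is to inspect sysfs for the device. Error -22 is EINVAL; a common cause, assumed here and not confirmed by these logs, is that the device has no IOMMU group because the IOMMU is not enabled. The following is a minimal Python sketch under that assumption. The helper names has_iommu_group and current_driver are hypothetical; the sysfs entries iommu_group and driver under /sys/bus/pci/devices/<address> are standard Linux paths.

```python
import os


def has_iommu_group(pci_addr, sysfs_root="/sys"):
    """Return True if the PCI device exposes an iommu_group entry in sysfs.

    Hypothetical helper: vfio-pci generally needs the device to belong to
    an IOMMU group, so a missing entry hints that the IOMMU is disabled.
    """
    path = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "iommu_group")
    return os.path.exists(path)


def current_driver(pci_addr, sysfs_root="/sys"):
    """Return the name of the driver bound to the device, or None if unbound."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    if not os.path.islink(link):
        return None
    # The "driver" entry is a symlink to /sys/bus/pci/drivers/<name>.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    addr = "0000:05:00.0"  # PCI address taken from the log lines above
    print("iommu group present:", has_iommu_group(addr))
    print("bound driver:", current_driver(addr))
```

If the first check reports False on the compute node, the kernel command line is the next place to look.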
One reason driverctl can fail is missing kernel args (iommu and hugepages). According to the sosreport, the grub parameters are not configured correctly on the compute node. This could be an issue with the templates.

I have a working version of an OSP-12 puddle-based containerized DPDK deployment. Here are the templates for your reference:
- https://github.com/krsacme/tht-dpdk/tree/osp11_to_12/osp12_ref
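
The kernel args check described above can be sketched in Python as follows. This is an illustration only, not part of the referenced templates. It assumes an Intel host where the IOMMU is enabled with intel_iommu=on and where hugepages are reserved with a hugepages=<count> argument; the function names parse_cmdline and missing_dpdk_args are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: [values]} mapping."""
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params.setdefault(name, []).append(value)
    return params


def missing_dpdk_args(cmdline):
    """Return the assumed-required arguments that are absent from cmdline.

    Assumption: intel_iommu=on and a positive hugepages=<count> are needed.
    Other platforms or setups may use different parameters.
    """
    params = parse_cmdline(cmdline)
    missing = []
    if "on" not in params.get("intel_iommu", []):
        missing.append("intel_iommu=on")
    counts = params.get("hugepages", [])
    if not any(v.isdigit() and int(v) > 0 for v in counts):
        missing.append("hugepages=<count>")
    return missing


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            running = f.read()
    except OSError:
        running = ""  # not on Linux, or /proc is unavailable
    print("missing kernel args:", missing_dpdk_args(running))
```

Running it on the compute node after deployment shows whether the arguments from the templates actually reached the booted kernel, which also requires a reboot after grub is updated.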
(In reply to Saravanan KR from comment #4)
> One of the reason for driverctl to fail is because of missing kernel args
> (iommu and hugepages). With the sosreport, the grub parameters looks are not
> configured correctly on the compute node. Could be issue with templates.
>
> I have a working version of osp12 puddle based containerized DPDK
> deployment. Here are the templates for your reference. -
> https://github.com/krsacme/tht-dpdk/tree/osp11_to_12/osp12_ref

I fixed the kernel args in the network-environment.yaml file, and the problem persists with the same error. Deploying a non-containerized OSP-12 DPDK setup succeeds; deploying it with containers fails. Any advice?
I am still working on 1486127, which is essential to get DPDK working in a containerized deployment. Once it is completed, I will update the steps for deploying DPDK.
(In reply to Saravanan KR from comment #6)
> I am still working on 1486127 which is a essential to get DPDK working on
> containerized deployment. Once completed, i will update the steps to deploy
> for DPDK.

Is this a duplicate of the other BZ?
I am closing this, as #1486127 is fixed and has been verified by QE. Please reopen if the issue is still not solved.

*** This bug has been marked as a duplicate of bug 1486127 ***