Bug 1900240
| Summary: | OVS DPDK bond LACP fails during provisioning | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | jpateteg |
| Component: | openvswitch | Assignee: | Open vSwitch development team <ovs-team> |
| Status: | CLOSED NOTABUG | QA Contact: | Eran Kuris <ekuris> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 13.0 (Queens) | CC: | apevec, cfields, chrisw, jlibosva, rhos-maint |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-12-16 16:56:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
openvswitch process:

```
[root@mxtla01lab2com03 openvswitch]# ps -ef | grep openvsw
openvsw+ 4073 1 0 Nov20 ? 00:00:03 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach
openvsw+ 4130 1 0 Nov20 ? 00:01:25 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach
root 27384 27371 0 12:05 pts/0 00:00:00 grep --color=auto openvsw
```

openvswitch logs:

```
[root@mxtla01lab2com03 openvswitch]# cat ovs-vswitchd.log
2020-11-21T08:25:01.915Z|00198|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2020-11-21T16:28:33.467Z|00199|netdev|WARN|could not create netdev dpdk0 of unknown type dpdk
2020-11-21T16:28:33.468Z|00200|bridge|WARN|could not open network device dpdk0 (Address family not supported by protocol)
2020-11-21T16:28:33.468Z|00201|netdev|WARN|could not create netdev dpdk1 of unknown type dpdk
2020-11-21T16:28:33.468Z|00202|bridge|WARN|could not open network device dpdk1 (Address family not supported by protocol)
[... the same pair of warnings for dpdk0 and dpdk1 repeats through entry 00234 ...]
2020-11-21T16:28:34.597Z|00235|netdev|WARN|could not create netdev dpdk0 of unknown type dpdk
2020-11-21T16:28:34.597Z|00236|bridge|WARN|could not open network device dpdk0 (Address family not supported by protocol)
2020-11-21T16:28:34.597Z|00237|netdev|WARN|could not create netdev dpdk1 of unknown type dpdk
2020-11-21T16:28:34.597Z|00238|bridge|WARN|could not open network device dpdk1 (Address family not supported by protocol)
[root@mxtla01lab2com03 openvswitch]#
```

config.json (os-net-config):
```json
{"network_config": [
  {"addresses": [{"ip_netmask": "14.195.13.148/25"}],
   "bonding_options": "mode=1 miimon=100",
   "members": [{"name": "ens1f0", "primary": true, "type": "interface", "use_dhcp": false},
               {"name": "ens1f1", "type": "interface", "use_dhcp": false}],
   "name": "bond0",
   "routes": [{"ip_netmask": "169.254.169.254/32", "next_hop": "14.195.13.135"}],
   "type": "linux_bond", "use_dhcp": false},
  {"bonding_options": "mode=1 miimon=100",
   "members": [{"mtu": 9000, "name": "ens1f4", "primary": true, "type": "interface", "use_dhcp": false},
               {"mtu": 9000, "name": "ens1f5", "type": "interface", "use_dhcp": false}],
   "mtu": 9000, "name": "bond1", "type": "linux_bond", "use_dhcp": false},
  {"addresses": [{"ip_netmask": "10.3.0.10/25"}],
   "device": "bond1", "mtu": 9000, "type": "vlan", "vlan_id": 456},
  {"bonding_options": "mode=1 miimon=100",
   "members": [{"name": "ens1f6", "primary": true, "type": "interface", "use_dhcp": false},
               {"name": "ens1f7", "type": "interface", "use_dhcp": false}],
   "name": "bond2", "type": "linux_bond", "use_dhcp": false},
  {"addresses": [{"ip_netmask": "14.195.11.142/25"}],
   "device": "bond2",
   "routes": [{"default": true, "next_hop": "14.195.11.129"}],
   "type": "vlan", "vlan_id": 440},
  {"addresses": [{"ip_netmask": "10.1.0.10/25"}],
   "device": "bond2", "type": "vlan", "vlan_id": 454},
  {"addresses": [{"ip_netmask": "10.2.0.10/25"}],
   "members": [
     {"members": [
        {"members": [{"name": "ens2f0", "type": "interface"}],
         "mtu": 9000, "name": "dpdk0", "type": "ovs_dpdk_port"},
        {"members": [{"name": "ens2f1", "type": "interface"}],
         "mtu": 9000, "name": "dpdk1", "type": "ovs_dpdk_port"}],
      "mtu": 9000, "name": "dpdkbond0",
      "ovs_options": "bond_mode=balance-slb lacp=active",
      "rx_queue": 2, "type": "ovs_dpdk_bond"}],
   "mtu": 9000, "name": "br-dpdk0",
   "ovs_extra": ["set port br-dpdk0 tag=455"],
   "type": "ovs_user_bridge", "use_dhcp": false},
  {"defroute": false, "mtu": 9000, "name": "ens3f0", "type": "interface", "use_dhcp": false},
  {"defroute": false, "mtu": 9000, "name": "ens3f1", "type": "interface", "use_dhcp": false}]}
```
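The "could not create netdev dpdk0 of unknown type dpdk" warnings usually mean ovs-vswitchd is running without DPDK support initialized, so no `type: dpdk` port can ever be created. A minimal diagnostic sketch (assumes OVS 2.7 or later on the affected compute node; run as root):

```shell
# Did ovs-vswitchd actually complete DPDK initialization?
# "true" means DPDK is compiled in and initialized; "false" or an
# error means DPDK ports cannot be created.
ovs-vsctl get Open_vSwitch . dpdk_initialized

# Is dpdk-init requested in the database? This must be "true" (and
# the daemon restarted afterwards) before DPDK ports will work.
ovs-vsctl get Open_vSwitch . other_config:dpdk-init

# Check which openvswitch package is installed; a build without
# DPDK support cannot create dpdk-type netdevs regardless of config.
rpm -qa | grep -i openvswitch
```

If `dpdk_initialized` is false while `other_config:dpdk-init` is true, the ovs-vswitchd log around startup normally explains why DPDK initialization failed.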
Created attachment 1731788 [details]
journalctl os-net-config
This is the output of journalctl for os-net-config
The deployment fails as a consequence of the connectivity issue caused by the missing bond:
2020-11-21 16:40:49Z [overcloud-ComputeOvsDpdkAllNodesValidationDeployment-oo763hmyjlvr.1]: SIGNAL_IN_PROGRESS Signal: deployment fffbd117-75f1-4217-a01f-7c3b7d14e39b failed (1)
2020-11-21 16:40:50Z [overcloud-ComputeOvsDpdkAllNodesValidationDeployment-oo763hmyjlvr.1]: CREATE_FAILED Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
2020-11-21 16:40:50Z [overcloud-ComputeOvsDpdkAllNodesValidationDeployment-oo763hmyjlvr]: UPDATE_FAILED Resource CREATE failed: Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
2020-11-21 16:40:50Z [ComputeOvsDpdkAllNodesValidationDeployment]: UPDATE_FAILED resources.ComputeOvsDpdkAllNodesValidationDeployment: Resource CREATE failed: Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
2020-11-21 16:40:50Z [overcloud]: UPDATE_FAILED Resource UPDATE failed: resources.ComputeOvsDpdkAllNodesValidationDeployment: Resource CREATE failed: Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
Stack overcloud UPDATE_FAILED
overcloud.ComputeOvsDpdkAllNodesValidationDeployment.1:
resource_type: OS::Heat::StructuredDeployment
physical_resource_id: fffbd117-75f1-4217-a01f-7c3b7d14e39b
status: CREATE_FAILED
status_reason: |
Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
deploy_stdout: |
...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
Ping to 10.2.0.10 failed. Retrying...
FAILURE
(truncated, view all with --long)
deploy_stderr: |
10.2.0.10 is not pingable. Local Network: 10.2.0.0/25
Heat Stack update failed.
Heat Stack update failed.
real 21m32.117s
user 0m5.091s
sys 0m0.508s
[stack@mxtlal01lab2dir ~]$
IP link show:
[heat-admin@mxtla01lab2com03 ~]$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 94:f1:28:a7:8f:32 brd ff:ff:ff:ff:ff:ff
3: ens3f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 94:f1:28:a7:8f:33 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6b brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6b brd ff:ff:ff:ff:ff:ff
6: ens1f4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6d brd ff:ff:ff:ff:ff:ff
7: ens1f5: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6d brd ff:ff:ff:ff:ff:ff
8: ens1f6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond2 state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6f brd ff:ff:ff:ff:ff:ff
9: ens1f7: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond2 state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6f brd ff:ff:ff:ff:ff:ff
12: enp1s0f4u4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
link/ether f6:fd:5c:69:09:7f brd ff:ff:ff:ff:ff:ff
13: ovs-netdev: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether a6:e5:a7:fc:a5:80 brd ff:ff:ff:ff:ff:ff
14: br-dpdk0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
link/ether ea:ee:26:37:76:4c brd ff:ff:ff:ff:ff:ff
15: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6b brd ff:ff:ff:ff:ff:ff
16: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6d brd ff:ff:ff:ff:ff:ff
17: bond2: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6f brd ff:ff:ff:ff:ff:ff
18: vlan454@bond2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6f brd ff:ff:ff:ff:ff:ff
19: vlan440@bond2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6f brd ff:ff:ff:ff:ff:ff
20: vlan456@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 52:74:56:d0:00:6d brd ff:ff:ff:ff:ff:ff
Hello Jair,

I'm focusing on these warnings:

|00225|netdev|WARN|could not create netdev dpdk1 of unknown type dpdk

Have you been including neutron-ovs-dpdk.yaml in the container image prepare and in the overcloud deployment? If that's not the issue, please open a support case and include an sosreport, and also upload your overcloud templates and deploy command. Also, please attach this bug to the case.

Thanks,
CFields
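For reference, including the OVS-DPDK service environment in both steps would look roughly like this. This is a sketch, not the reporter's actual commands: the neutron-ovs-dpdk.yaml path is the usual tripleo-heat-templates location, and every other argument is a placeholder for whatever the existing prepare/deploy commands already pass:

```shell
# Hypothetical: add the OVS-DPDK environment to image prepare...
openstack overcloud container image prepare \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs-dpdk.yaml
  # ...plus the remaining arguments from your existing prepare command

# ...and to the overcloud deploy itself.
openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs-dpdk.yaml \
  -e network-environment.yaml
  # ...plus the remaining environment files from your existing deploy command
```

Without this environment file, the overcloud images and host configuration are not set up for DPDK, which matches the "unknown type dpdk" symptom.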
Description of problem:
os-net-config fails to create an OVS bond with LACP; as a result, the connectivity tests fail and the overcloud deployment aborts.

Version-Release number of selected component (if applicable):
ovs-vsctl (Open vSwitch) 2.11.0
RHOSP 13 z12

How reproducible: 100%

Steps to Reproduce:
1. Create an LACP bonding in the compute-dpdk template.
2. Run the overcloud deploy command.

Actual results:
The DPDK bond is not created, with the below ovs-vsctl show output:

```
Port "dpdkbond0"
    Interface "dpdk0"
        type: dpdk
        options: {dpdk-devargs="0000:af:00.0", n_rxq="2"}
        error: "could not open network device dpdk0 (Address family not supported by protocol)"
    Interface "dpdk1"
        type: dpdk
        options: {dpdk-devargs="0000:af:00.1", n_rxq="2"}
        error: "could not open network device dpdk1 (Address family not supported by protocol)"
ovs_version: "2.11.0"
```

The bond is not created:

```
[root@mxtla01lab2com03 openvswitch]# ovs-appctl bond/list
bond	type	recircID	slaves
[root@mxtla01lab2com03 openvswitch]#
```

Expected results:
The bond should be created.

Additional info:
The TenantIp is on this NIC, so the deployment does not succeed because there is no ping between compute nodes on this network.

The DPDK parameters in use (in network-environment.yaml):

```yaml
ComputeOvsDpdkParameters:
  KernelArgs: default_hugepagesz=1GB hugepagesz=1G hugepages=400 iommu=pt intel_iommu=on isolcpus=1-23,25-47,49-71,73-95
  TunedProfileName: "cpu-partitioning"
  IsolCpusList: "1-23,25-47,49-71,73-95"  # For the OVSDPDK role, only NUMA0 has DPDK; it is not necessary to isolate all CPUs.
  NovaVcpuPinSet: ['3-23,27-47,51-71,75-95']
  NovaReservedHostMemory: 8192
  OvsDpdkSocketMemory: "1024,4096"
  OvsDpdkMemoryChannels: "8"
  OvsDpdkCoreList: "0,24,48,72"  # first thread of each core
  NovaComputeCpuSharedSet: ['0,24,48,72']
  OvsPmdCoreList: "1,49,2,50,25,73,26,74"  # 2 CPUs with their sibling threads
  NeutronBridgeMappings:
    - datacentre:br-ex
    - tenant:br-dpdk0
  NovaLibvirtRxQueueSize: 1024
  NovaLibvirtTxQueueSize: 1024
  NeutronDatapathType: netdev
```

The NIC template piece:

```yaml
- type: ovs_user_bridge
  name: br-dpdk0
  use_dhcp: false
  ovs_extra:
    - str_replace:
        template: set port br-dpdk0 tag=_VLAN_TAG_
        params:
          _VLAN_TAG_:
            get_param: TenantNetworkVlanID
  addresses:
    - ip_netmask:
        get_param: TenantIpSubnet
  mtu: 9000
  members:
    - type: ovs_dpdk_bond
      name: dpdkbond0
      ovs_options: "bond_mode=balance-slb lacp=active"
      mtu: 9000
      rx_queue: 2
      members:
        - type: ovs_dpdk_port
          name: dpdk0
          mtu: 9000
          members:
            - type: interface
              name: ens2f0
        - type: ovs_dpdk_port
          name: dpdk1
          mtu: 9000
          members:
            - type: interface
              name: ens2f1
```
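Once the underlying DPDK issue is resolved, the bond and its LACP negotiation can be checked with standard ovs-appctl commands (a sketch, assuming the bond name `dpdkbond0` from the template above; run as root on the compute node):

```shell
# List all bonds known to ovs-vswitchd; dpdkbond0 should now appear
# instead of the empty table seen in the actual results.
ovs-appctl bond/list

# Detailed bond state: mode, member link status, active member.
ovs-appctl bond/show dpdkbond0

# LACP actor/partner state per member; "negotiated" status indicates
# the switch side is also speaking LACP.
ovs-appctl lacp/show dpdkbond0
```

If `lacp/show` reports the protocol as not negotiated, the switch-side port channel configuration is the next thing to verify.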