Bug 1557526

Summary: [Deployment] with odl setup, nova failed to attach guest port to flat provider network
Product: Red Hat OpenStack
Reporter: jianzzha
Component: puppet-neutron
Assignee: Tim Rozet <trozet>
Status: CLOSED ERRATA
QA Contact: Itzik Brown <itbrown>
Severity: high
Docs Contact:
Priority: high
Version: 12.0 (Pike)
CC: aadam, amuller, berrange, chrisw, dasmith, eglynn, jhakimra, jianzzha, jjoyce, jschluet, kchamart, mkolesni, nyechiel, sbauza, sferdjao, sgordon, slinaber, srevivo, trozet, tvignaud, vromanso
Target Milestone: beta
Keywords: Triaged
Target Release: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Whiteboard: odl_deployment
Fixed In Version: puppet-neutron-12.4.1-0.20180412211912.78a3933.el7ost openstack-tripleo-heat-templates-8.0.2-0.20180414062830.5f869f2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
N/A
Last Closed: 2018-06-27 13:47:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
karaf.log none

Description jianzzha 2018-03-16 19:40:18 UTC
Description of problem:
With ODL as the neutron plugin, nova fails to attach a guest port to a flat provider network.

Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
1. neutron net-create provider-nfv1 --provider:network_type flat --provider:physical_network dpdk1 --port_security_enabled=False
2. neutron subnet-create --name provider-nfv1 --disable-dhcp --gateway 20.0.0.1 provider-nfv1 20.1.0.0/16  
3. provider1=$(openstack port create --network provider-nfv1 nfv1-port | awk '/ id/ {print $4}')
4. openstack server create --flavor nfv --image ${vm_image_name} --nic port-id=$provider1 --key-name demo-key demo1
5. error in nova: message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance ea8be253-27b6-4366-8b71-6bcee6184f87. Last exception: Binding failed for port 23e05c09-3b9c-4ca3-96c7-5cfeff20be3d, please check neutron logs for more information.", "code": 500, "details": "  File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 549, in build_instances

Actual results:
Without ODL, nova was able to attach the guest port to the flat network; the same behavior was expected with ODL as the neutron plugin. If we change the flat network to a vlan network, then it works fine.

Expected results:
nova should be able to attach guest port to flat network

Comment 1 jianzzha 2018-03-16 19:57:58 UTC
2018-03-16 19:15:10.844 110318 WARNING networking_odl.ml2.pseudo_agentdb_binding [req-a0cf9392-eb40-4994-a696-3801b92e9fca f2ef06f25fa044c6a5029390354f67ea c6e9455f0ded46968ac11087da3b47df - default default] Failed to bind Port 23e05c09-3b9c-4ca3-96c7-5cfeff20be3d devid ea8be253-27b6-4366-8b71-6bcee6184f87 owner compute:nova for host compute-0.localdomain on network 9efa7bb9-97c3-4bdd-bc97-018d65bb7b10.
2018-03-16 19:15:10.844 110318 WARNING networking_odl.ml2.pseudo_agentdb_binding [req-a0cf9392-eb40-4994-a696-3801b92e9fca f2ef06f25fa044c6a5029390354f67ea c6e9455f0ded46968ac11087da3b47df - default default] No ODL hostconfigs for host compute-0.localdomain found in agentdb
2018-03-16 19:15:10.844 110318 ERROR neutron.plugins.ml2.managers [req-a0cf9392-eb40-4994-a696-3801b92e9fca f2ef06f25fa044c6a5029390354f67ea c6e9455f0ded46968ac11087da3b47df - default default] Failed to bind port 23e05c09-3b9c-4ca3-96c7-5cfeff20be3d on host compute-0.localdomain for vnic_type normal using segments [{'network_id': '9efa7bb9-97c3-4bdd-bc97-018d65bb7b10', 'segmentation_id': None, 'physical_network': u'dpdk0', 'id': 'af7c7acc-d283-480a-baf8-a8589265e025', 'network_type': u'flat'}]

Comment 2 Mike Kolesnik 2018-04-02 13:16:44 UTC
I see you had an error "No ODL hostconfigs for host compute-0.localdomain found in agentdb"

How did you configure the deployment to use the flat provider network?

Also please attach debug logs for the karaf and neutron servers when this problem occurs.

Comment 3 jianzzha 2018-04-04 16:46:04 UTC
Mike, I will get the info back to you once the testbed has finished its current job.

Comment 5 jianzzha 2018-04-08 04:37:04 UTC
In neutron log I noticed the flat network is not on the allowed_network_types list.

2018-04-08 04:17:41.263 727599 DEBUG networking_odl.ml2.pseudo_agentdb_binding [req-e816cedb-cc8a-42fb-9cff-8ef58f7a4e3a - - - - -] ODLPORTBINDING hostconfigs:
[
    "{\"hostconfigs\":{\"hostconfig\":[{\"host-id\":\"controller-0.localdomain\",\"host-type\":\"ODL L2\",\"config\":\"{  \\\"supported_vnic_types\\\": ",
    "[{    \\\"vnic_type\\\": \\\"normal\\\",    \\\"vif_type\\\": \\\"ovs\\\",    \\\"vif_details\\\": {}  }],  \\\"allowed_network_types\\\": [\\\"local\\\",\\\"",
    "vlan\\\",\\\"vxlan\\\",\\\"gre\\\"],  \\\"bridge_mappings\\\": {\\\"dpdk0\\\":\\\"br-link0\\\",\\\"dpdk1\\\":\\\"br-link1\\\",\\\"access\\\":\\\"br-p2p1\\\"}}\"},{\"hos",
    "t-id\":\"compute-0.localdomain\",\"host-type\":\"ODL L2\",\"config\":\"{  \\\"supported_vnic_types\\\": [{    \\\"vnic_type\\\": \\\"normal\\\",    \\\"",
    "vif_type\\\": \\\"vhostuser\\\",    \\\"vif_details\\\": {      \\\"uuid\\\": \\\"10defc86-796f-4476-a617-4fade093ce2e\\\",      \\\"has_datapath_ty",
    "pe_netdev\\\": true,      \\\"port_prefix\\\": \\\"vhu\\\",      \\\"vhostuser_socket_dir\\\": \\\"/var/run/openvswitch\\\",      \\\"vhostuser_ovs_",
    "plug\\\": true,      \\\"vhostuser_mode\\\": \\\"client\\\",      \\\"vhostuser_socket\\\": \\\"/var/run/openvswitch/vhu$PORT_ID\\\"    }  }],  \\\"",
    "allowed_network_types\\\": [\\\"local\\\",\\\"vlan\\\",\\\"vxlan\\\",\\\"gre\\\"],  \\\"bridge_mappings\\\": {\\\"dpdk0\\\":\\\"br-link0\\\",\\\"dpdk1\\\":\\\"br-li",
    "nk1\\\",\\\"access\\\":\\\"br-p2p1\\\"}}\"}]}}"

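For anyone reading the dump above: the "config" value of each hostconfig entry is itself a JSON document stored as a string, so it has to be decoded twice. Below is a minimal sketch of that check, not code from networking_odl. The sample data is an abbreviated copy of the compute-0 entry (vif_details trimmed to an empty dict), and the helper name allowed_network_types() is made up for illustration.

```python
import json

# Abbreviated hostconfig modeled on the compute-0 entry in the log above.
# The "config" value is a JSON document embedded as a string.
SAMPLE_HOSTCONFIGS = json.dumps({
    "hostconfigs": {
        "hostconfig": [
            {
                "host-id": "compute-0.localdomain",
                "host-type": "ODL L2",
                "config": json.dumps({
                    "supported_vnic_types": [
                        {"vnic_type": "normal", "vif_type": "vhostuser", "vif_details": {}}
                    ],
                    "allowed_network_types": ["local", "vlan", "vxlan", "gre"],
                    "bridge_mappings": {
                        "dpdk0": "br-link0",
                        "dpdk1": "br-link1",
                        "access": "br-p2p1",
                    },
                }),
            }
        ]
    }
})


def allowed_network_types(raw, host_id):
    """Return the allowed_network_types list for host_id, or None if the host is absent.

    Hypothetical helper, written only to illustrate the double decode.
    """
    outer = json.loads(raw)
    for entry in outer["hostconfigs"]["hostconfig"]:
        if entry["host-id"] == host_id:
            inner = json.loads(entry["config"])  # second decode: nested JSON string
            return inner.get("allowed_network_types", [])
    return None


types = allowed_network_types(SAMPLE_HOSTCONFIGS, "compute-0.localdomain")
print("flat allowed:", "flat" in types)
print("vlan allowed:", "vlan" in types)
```

With the values from the log, "flat" is absent from the list while "vlan" is present, which matches the observation that only the vlan provider network binds.
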
Comment 6 jianzzha 2018-04-08 04:38:19 UTC
Created attachment 1418776 [details]
karaf.log

Comment 7 jianzzha 2018-04-09 12:47:43 UTC
script steps:
1. create network and subnet
    neutron net-create provider-nfv$i ${provider_opt} \
                                      --provider:physical_network dpdk$(($i % 2)) \
                                      --port_security_enabled=False

    neutron subnet-create --name provider-nfv$i \
                        --disable-dhcp \
                        --gateway 20.$i.0.1 \
                        provider-nfv$i 20.$i.0.0/16

2. create port on the network
  provider1=$(openstack port create --network provider-nfv$((i - 1)) ${vnic_option} nfv$((i - 1))-port | awk '/ id/ {print $4}')
  provider2=$(openstack port create --network provider-nfv$i ${vnic_option} nfv$i-port | awk '/ id/ {print $4}')


3. start instance using these ports,
  openstack server create --flavor nfv \
                          --image ${vm_image_name} \
                          --nic port-id="$id3" \
                          --nic port-id="$id1" \
                          --nic port-id="$id2" \
                          --key-name demo-key \
                          $opt $name


It works fine when using a vlan provider network.

Comment 8 jianzzha 2018-04-09 12:50:54 UTC
(overcloud) [stack@perf98 nfv-scripts]$ openstack network show provider-nfv1
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | UP                                   |
| availability_zone_hints   |                                      |
| availability_zones        | nova                                 |
| created_at                | 2018-04-09T03:24:08Z                 |
| description               |                                      |
| dns_domain                | None                                 |
| id                        | d09bf38d-4ebd-4c66-aad4-a3ef4363d50e |
| ipv4_address_scope        | None                                 |
| ipv6_address_scope        | None                                 |
| is_default                | None                                 |
| is_vlan_transparent       | None                                 |
| mtu                       | 9000                                 |
| name                      | provider-nfv1                        |
| port_security_enabled     | False                                |
| project_id                | 2ba030aba24b4d7ead9e891dd294f456     |
| provider:network_type     | flat                                 |
| provider:physical_network | dpdk1                                |
| provider:segmentation_id  | None                                 |
| qos_policy_id             | None                                 |
| revision_number           | 3                                    |
| router:external           | Internal                             |
| segments                  | None                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | 7987d64a-3da1-42d1-93b0-f0a392b8e8a8 |
| tags                      |                                      |
| updated_at                | 2018-04-09T03:24:10Z                 |
+---------------------------+--------------------------------------+

(overcloud) [stack@perf98 nfv-scripts]$ openstack port show nfv1-port
+-----------------------+-------------------------------------------------------------------------+
| Field                 | Value                                                                   |
+-----------------------+-------------------------------------------------------------------------+
| admin_state_up        | UP                                                                      |
| allowed_address_pairs |                                                                         |
| binding_host_id       |                                                                         |
| binding_profile       |                                                                         |
| binding_vif_details   |                                                                         |
| binding_vif_type      | unbound                                                                 |
| binding_vnic_type     | normal                                                                  |
| created_at            | 2018-04-09T03:24:15Z                                                    |
| data_plane_status     | None                                                                    |
| description           |                                                                         |
| device_id             |                                                                         |
| device_owner          |                                                                         |
| dns_assignment        | None                                                                    |
| dns_name              | None                                                                    |
| extra_dhcp_opts       |                                                                         |
| fixed_ips             | ip_address='20.1.0.5', subnet_id='7987d64a-3da1-42d1-93b0-f0a392b8e8a8' |
| id                    | 20fbbf6e-a491-4780-a031-b0f144846cfc                                    |
| ip_address            | None                                                                    |
| mac_address           | fa:16:3e:9f:2e:a4                                                       |
| name                  | nfv1-port                                                               |
| network_id            | d09bf38d-4ebd-4c66-aad4-a3ef4363d50e                                    |
| option_name           | None                                                                    |
| option_value          | None                                                                    |
| port_security_enabled | False                                                                   |
| project_id            | 2ba030aba24b4d7ead9e891dd294f456                                        |
| qos_policy_id         | None                                                                    |
| revision_number       | 5                                                                       |
| security_group_ids    |                                                                         |
| status                | DOWN                                                                    |
| subnet_id             | None                                                                    |
| tags                  |                                                                         |
| trunk_details         | None                                                                    |
| updated_at            | 2018-04-09T03:24:49Z                                                    |
+-----------------------+-------------------------------------------------------------------------+
(overcloud) [stack@perf98 nfv-scripts]$

Comment 9 jianzzha 2018-04-09 12:53:16 UTC
Also, in a non-ODL deployment, the flat provider network works fine.

Comment 10 jianzzha 2018-04-09 14:29:09 UTC
Looking at networking_odl/ml2/pseudo_agentdb_binding.py:_hconfig_bind_port, it's looking for a segment ID even for the flat network; that's why the port bind failed.

Comment 11 Tim Rozet 2018-04-09 18:01:13 UTC
As Mike mentioned, the issue is that 'flat' is missing from the allowed network types in the host config by default.
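
To illustrate the point, here is a simplified sketch of a binding check that rejects any segment whose network type is not listed in the host config. It is an assumption about the general shape of the logic, not the actual networking_odl implementation; find_bindable_segment() is a hypothetical name, and the second config (with "flat" added) is only a guess at what a corrected host config would contain.

```python
def find_bindable_segment(segments, host_config):
    """Return the first segment whose network_type the host config allows, else None.

    Hypothetical helper for illustration only.
    """
    allowed = host_config.get("allowed_network_types", [])
    for segment in segments:
        if segment["network_type"] in allowed:
            return segment
    return None


# Segment copied from the "Failed to bind port" error in comment 1.
flat_segment = {
    "network_id": "9efa7bb9-97c3-4bdd-bc97-018d65bb7b10",
    "segmentation_id": None,
    "physical_network": "dpdk0",
    "id": "af7c7acc-d283-480a-baf8-a8589265e025",
    "network_type": "flat",
}

# Allowed types as reported in the hostconfigs from comment 5.
default_config = {"allowed_network_types": ["local", "vlan", "vxlan", "gre"]}
# Assumption: a corrected host config would also list "flat".
config_with_flat = {"allowed_network_types": ["local", "flat", "vlan", "vxlan", "gre"]}

print(find_bindable_segment([flat_segment], default_config))
print(find_bindable_segment([flat_segment], config_with_flat) is flat_segment)
```

Under these assumptions the flat segment is skipped with the default list and accepted once "flat" is present, independent of segmentation_id being None.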

Comment 12 jianzzha 2018-04-09 21:12:46 UTC
Tim, Mike, the flow table rule might not be right on the flat network, as I wasn't able to pass traffic through even though the port was attached to the guest successfully.

In my setup, the only port that was accessible is the access port, which uses the vlan provider network.

So beyond the configuration fix, something else might need to be done as well (maybe on a separate BZ). I will let you know once I have more info.

Comment 13 jianzzha 2018-04-10 02:23:20 UTC
[root@compute-0 ~]# ovs-ofctl -OOpenFlow13 dump-ports-desc br-int
OFPST_PORT_DESC reply (OF1.3) (xid=0x2):
 1(br-p2p1-patch): addr:9e:ce:e9:50:42:5a
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(br-link1-patch): addr:a6:27:bf:32:a1:5c
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 3(br-link0-patch): addr:52:42:b1:53:2c:aa
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 34(vhubb6a72d6-c6): addr:00:00:00:00:00:00
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 35(vhu67004112-1f): addr:00:00:00:00:00:00
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 36(vhu99356bd0-b7): addr:00:00:00:00:00:00
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:54:60:75:97:76:34
     config:     PORT_DOWN
     state:      LINK_DOWN
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
[root@compute-0 ~]# ovs-appctl ofproto/trace br-int in_port=3,arp,dl_dst=ff:ff:ff:ff:ff:ff
Flow: arp,in_port=3,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=0.0.0.0,arp_tpa=0.0.0.0,arp_op=0,arp_sha=00:00:00:00:00:00,arp_tha=00:00:00:00:00:00

bridge("br-int")
----------------
 0. in_port=3,vlan_tci=0x0000/0x1fff, priority 4, cookie 0x8000000
    write_metadata:0x80000000001/0xffffff0000000001
    goto_table:17
17. No match.
    drop

Final flow: arp,metadata=0x80000000001,in_port=3,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=0.0.0.0,arp_tpa=0.0.0.0,arp_op=0,arp_sha=00:00:00:00:00:00,arp_tha=00:00:00:00:00:00
Megaflow: recirc_id=0,arp,in_port=3,vlan_tci=0x0000/0x1fff
Datapath actions: drop
[root@compute-0 ~]#

Comment 14 Mike Kolesnik 2018-04-10 12:02:41 UTC
(In reply to jianzzha from comment #12)
> Tim, Mike, the flow table rule might not be right on the flat network, as I
> wasn't able to pass traffic through even though the port was attached to the
> guest successfully.
> 
> In my setup, the only port that was accessible is the access port,
> which uses the vlan provider network.
> 
> So beyond the configuration fix, something else might need to be done as
> well (maybe on a separate BZ). I will let you know once I have more info.

Let's use this BZ to track the TripleO fix, and please file another one that depends on this one if the VM attached to the flat network doesn't actually work, since that one will be on the opendaylight component and would probably be fixed by someone other than Tim.

Comment 15 Tim Rozet 2018-04-10 12:58:55 UTC
I agree with Mike.  Let's open a separate BZ for that if necessary.  But I'm wondering, why was I able to ping the VM and ssh into it if traffic is not working?  I used the 10.1.1.x subnet, I thought that was the flat port?

Comment 16 jianzzha 2018-04-11 00:44:58 UTC
Tim, when you did the ping and ssh, it used the access network, which was on a vlan network; the two other flat network ports are for data traffic. I failed to send traffic through the data ports even though the access port works.

I agree that's a separate issue.

Comment 19 Itzik Brown 2018-05-22 08:32:51 UTC
Checked with:
puppet-neutron-12.4.1-0.20180412211913.el7ost.noarch

Ran:
# neutron net-create provider-nfv1 --provider:network_type flat --provider:physical_network datacentre --port_security_enabled=False
# neutron subnet-create --name provider-nfv1 --disable-dhcp --gateway 20.0.0.1 provider-nfv1 20.1.0.0/16
# provider1=$(openstack port create --network provider-nfv1 nfv1-port | awk '/ id/ {print $4}')
# openstack server create --flavor rhel --image  fa056ae3-680c-4657-b3ac-

The instance is up.

Comment 21 errata-xmlrpc 2018-06-27 13:47:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086