Bug 1390199

Summary: Launching instance sometimes results in having 2 IP's from the same network.
Product: Red Hat OpenStack Reporter: Jeremy <jmelvin>
Component: openstack-nova Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED CURRENTRELEASE QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: medium Docs Contact:
Priority: medium    
Version: 9.0 (Mitaka) CC: amuller, chrisw, dasmith, eglynn, jlibosva, jmelvin, kchamart, mwitt, sbauza, sgordon, smooney, srevivo, vromanso
Target Milestone: --- Keywords: Triaged, Unconfirmed, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-21 23:24:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jeremy 2016-10-31 13:32:07 UTC
Description of problem:
An instance launched with the following command, which requests a single NIC, came up with two IPs assigned from the same network:

nova boot --flavor 2 --boot-volume xxxxxx --nic net-id=613ef475-1de3-40ee-a8b2-4086855348f4 --security-group default --key-name snow_osp9 vnx63Ibug

[root@overcloud-controller-0 qemu]# nova list
+--------------------------------------+-------------------+---------+------------+-------------+-----------------------------------+
| ID                                   | Name              | Status  | Task State | Power State | Networks                          |
+--------------------------------------+-------------------+---------+------------+-------------+-----------------------------------+
| 0a4b139b-788f-44e5-acd6-a5529e08383f | vnx63Ibug         | ACTIVE  | -          | Running     | demo-net=172.20.1.61, 172.20.1.64 |
+--------------------------------------+-------------------+---------+------------+-------------+-----------------------------------+



Version-Release number of selected component (if applicable):
openstack-neutron-8.1.2-4.el7ost.noarch  

How reproducible:
unknown 

Steps to Reproduce:
1. Launch an instance.
2. Notice that two IPs are sometimes assigned.

Actual results:
The instance gets two IPs.

Expected results:
The instance is assigned one IP.

Additional info:


[root@overcloud-controller-0 qemu]# neutron subnet-list
+--------------------------------------+-------------+---------------+------------------------------------------------+
| id                                   | name        | cidr          | allocation_pools                               |
+--------------------------------------+-------------+---------------+------------------------------------------------+
| a6752c2b-d83c-4e87-a3dd-f1b4c425880b | ext-subnet  | 10.1.1.0/24   | {"start": "10.1.1.51", "end": "10.1.1.100"}    |
| c91dcbfd-939d-4a10-a9c9-895af8483641 | demo-subnet | 172.20.1.0/24 | {"start": "172.20.1.2", "end": "172.20.1.254"} |
+--------------------------------------+-------------+---------------+------------------------------------------------+
[root@overcloud-controller-0 qemu]# neutron net-list
+--------------------------------------+----------+----------------------------------------------------+
| id                                   | name     | subnets                                            |
+--------------------------------------+----------+----------------------------------------------------+
| 613ef475-1de3-40ee-a8b2-4086855348f4 | demo-net | c91dcbfd-939d-4a10-a9c9-895af8483641 172.20.1.0/24 |
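| c886edb0-51ba-4ef0-9b1f-445eb081f766 | ext-net  | a6752c2b-d83c-4e87-a3dd-f1b4c425880b 10.1.1.0/24   |
+--------------------------------------+----------+----------------------------------------------------+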



[heat-admin@overcloud-controller-0 ~]$ neutron port-list | grep 172.20.1.6
| 1b7ca842-e052-4e03-a9af-59a1e30b91e0 |      | fa:16:3e:00:4a:88 | {"subnet_id": "c91dcbfd-939d-4a10-a9c9-895af8483641", "ip_address": "172.20.1.61"} |
| b58e8aaf-233b-47b0-8c15-22e35b6492f5 |      | fa:16:3e:d0:39:c2 | {"subnet_id": "c91dcbfd-939d-4a10-a9c9-895af8483641", "ip_address": "172.20.1.64"} |


###sosreport-20161031-025411/overcloud-controller-0.localdomain/var/log/nova/nova-api.log

2016-10-25 08:52:59.784 16384 DEBUG keystoneauth.session [req-10e9fdd2-4f20-4f2e-9fc1-b6c6fc6d9d8c 347c3cc2f3834d93bef9b25989ef6e94 8ad128085c77427585de280439c4a283 - - -] REQ: curl -g -i -X GET http://172.16.2.4:9696/v2.0/ports.json?device_id=7c6cddc5-8729-4b44-a6c4-5b5d10591151&device_id=0a4b139b-788f-44e5-acd6-a5529e08383f&device_id=1f297472-c067-4b5d-b53a-878e9c61002c&device_id=2d704df6-67b7-44e2-88d2-8466c293968b&device_id=c71d6c95-5cc4-416a-8417-58be88232af5&device_id=551e619d-1d8d-4df9-9498-ea0e19272879&device_id=ed65e107-8dcc-4633-be1b-a85ea99a7601 -H "User-Agent: python-neutronclient" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}a45709ec42939f23d323da521765591212c3a083" _http_log_request /usr/lib/python2.7/site-packages/keystoneauth1/session.py:248
2016-10-25 08:52:59.846 16384 DEBUG keystoneauth.session [req-10e9fdd2-4f20-4f2e-9fc1-b6c6fc6d9d8c 347c3cc2f3834d93bef9b25989ef6e94 8ad128085c77427585de280439c4a283 - - -] RESP: [200] Content-Type: application/json; charset=UTF-8 Content-Length: 1834 X-Openstack-Request-Id: req-27a069e6-d61b-4d52-a204-641a1a7e07b9 Date: Tue, 25 Oct 2016 08:52:59 GMT Connection: keep-alive

RESP BODY: {"ports": [{"status": "DOWN", "binding:host_id": "overcloud-compute-0.localdomain", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-10-25T08:28:33", "device_owner": "compute:None", "port_security_enabled": true, "binding:profile": {}, "qos_policy_id": null, "fixed_ips": [{"subnet_id": "c91dcbfd-939d-4a10-a9c9-895af8483641", "ip_address": "172.20.1.61"}], "id": "1b7ca842-e052-4e03-a9af-59a1e30b91e0", "security_groups": ["00517d56-036b-40ad-8ef2-ece33d0b126a"], "device_id": "0a4b139b-788f-44e5-acd6-a5529e08383f", "name": "", "admin_state_up": true, "network_id": "613ef475-1de3-40ee-a8b2-4086855348f4", "dns_name": null, "binding:vif_details": {"port_filter": true, "ovs_hybrid_plug": true}, "binding:vnic_type": "normal", "binding:vif_type": "ovs", "tenant_id": "820ba1b0db9645ae9411700292150957", "mac_address": "fa:16:3e:00:4a:88", "created_at": "2016-10-25T08:28:32"}, {"status": "ACTIVE", "binding:host_id": "overcloud-controller-0.localdomain", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-10-25T08:53:16", "device_owner": "compute:None", "port_security_enabled": true, "binding:profile": {}, "qos_policy_id": null, "fixed_ips": [{"subnet_id": "c91dcbfd-939d-4a10-a9c9-895af8483641", "ip_address": "172.20.1.64"}], "id": "b58e8aaf-233b-47b0-8c15-22e35b6492f5", "security_groups": ["00517d56-036b-40ad-8ef2-ece33d0b126a"], "device_id": "0a4b139b-788f-44e5-acd6-a5529e08383f", "name": "", "admin_state_up": true, "network_id": "613ef475-1de3-40ee-a8b2-4086855348f4", "dns_name": null, "binding:vif_details": {"port_filter": true, "ovs_hybrid_plug": true}, "binding:vnic_type": "normal", "binding:vif_type": "ovs", "tenant_id": "820ba1b0db9645ae9411700292150957", "mac_address": "fa:16:3e:d0:39:c2", "created_at": "2016-10-25T08:53:02"}]}
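
The response above shows two ports with the same device_id on the same network_id. A minimal sketch of how such duplicates could be picked out of a ports listing follows. It assumes the "ports" list from the response body has already been parsed into Python dicts; the helper name find_duplicate_ports is hypothetical and is not part of nova or neutron, and the sample data is a trimmed copy of the two ports above.

```python
from collections import defaultdict


def find_duplicate_ports(ports):
    """Return {(device_id, network_id): [ports]} for groups with more than one port.

    Hypothetical helper for illustration; it only inspects already-parsed port dicts.
    """
    groups = defaultdict(list)
    for port in ports:
        # Skip unbound ports, which have an empty device_id.
        if not port.get("device_id"):
            continue
        groups[(port["device_id"], port["network_id"])].append(port)
    return {key: group for key, group in groups.items() if len(group) > 1}


# Trimmed copy of the two ports from the RESP BODY above.
ports = [
    {
        "id": "1b7ca842-e052-4e03-a9af-59a1e30b91e0",
        "device_id": "0a4b139b-788f-44e5-acd6-a5529e08383f",
        "network_id": "613ef475-1de3-40ee-a8b2-4086855348f4",
        "binding:host_id": "overcloud-compute-0.localdomain",
        "fixed_ips": [{"ip_address": "172.20.1.61"}],
    },
    {
        "id": "b58e8aaf-233b-47b0-8c15-22e35b6492f5",
        "device_id": "0a4b139b-788f-44e5-acd6-a5529e08383f",
        "network_id": "613ef475-1de3-40ee-a8b2-4086855348f4",
        "binding:host_id": "overcloud-controller-0.localdomain",
        "fixed_ips": [{"ip_address": "172.20.1.64"}],
    },
]

duplicates = find_duplicate_ports(ports)
for (device_id, network_id), group in duplicates.items():
    ips = [ip["ip_address"] for port in group for ip in port["fixed_ips"]]
    print(device_id, network_id, ips)
```

Note that the two ports report different binding:host_id values, so a check like this may also help show which hosts were involved for a given instance.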


[jmelvin@collab-shell 01728610]$ grep 0a4b139b sosreport-20161031-025411/overcloud-controller-0.localdomain/var/log/neutron/dhcp-agent.log 
2016-10-25 08:28:33.256 16129 INFO neutron.agent.dhcp.agent [req-0393af55-3459-4c3d-a301-8f602f5d6d29 06545f81aa6b4907a18db94c8cbbeab7 8ad128085c77427585de280439c4a283 - - -] Trigger reload_allocations for port admin_state_up=True, allowed_address_pairs=[], binding:host_id=overcloud-compute-0.localdomain, binding:profile=, binding:vif_details=ovs_hybrid_plug=True, port_filter=True, binding:vif_type=ovs, binding:vnic_type=normal, created_at=2016-10-25T08:28:32, description=, device_id=0a4b139b-788f-44e5-acd6-a5529e08383f, device_owner=compute:None, dns_name=None, extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': u'c91dcbfd-939d-4a10-a9c9-895af8483641', u'ip_address': u'172.20.1.61'}], id=1b7ca842-e052-4e03-a9af-59a1e30b91e0, mac_address=fa:16:3e:00:4a:88, name=, network_id=613ef475-1de3-40ee-a8b2-4086855348f4, port_security_enabled=True, qos_policy_id=None, security_groups=[u'00517d56-036b-40ad-8ef2-ece33d0b126a'], status=DOWN, tenant_id=820ba1b0db9645ae9411700292150957, updated_at=2016-10-25T08:28:33
2016-10-25 08:53:02.959 16129 INFO neutron.agent.dhcp.agent [req-d2b62fad-6f4f-4d22-83e5-ee20037bb440 06545f81aa6b4907a18db94c8cbbeab7 8ad128085c77427585de280439c4a283 - - -] Trigger reload_allocations for port admin_state_up=True, allowed_address_pairs=[], binding:host_id=overcloud-controller-0.localdomain, binding:profile=, binding:vif_details=ovs_hybrid_plug=True, port_filter=True, binding:vif_type=ovs, binding:vnic_type=normal, created_at=2016-10-25T08:53:02, description=, device_id=0a4b139b-788f-44e5-acd6-a5529e08383f, device_owner=compute:None, dns_name=None, extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': u'c91dcbfd-939d-4a10-a9c9-895af8483641', u'ip_address': u'172.20.1.64'}], id=b58e8aaf-233b-47b0-8c15-22e35b6492f5, mac_address=fa:16:3e:d0:39:c2, name=, network_id=613ef475-1de3-40ee-a8b2-4086855348f4, port_security_enabled=True, qos_policy_id=None, security_groups=[u'00517d56-036b-40ad-8ef2-ece33d0b126a'], status=DOWN, tenant_id=820ba1b0db9645ae9411700292150957, updated_at=2016-10-25T08:53:02

Comment 4 Red Hat Bugzilla Rules Engine 2017-06-04 02:40:26 UTC
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 5 Jeremy 2017-06-29 14:18:35 UTC
Hello,
We seem to have another customer on OSP 9 hitting this issue. Here is a better idea of how the problem happens:

While booting 60 concurrent tenant instances we see something like the following:

Some instances fail on the attempted build (possibly because glance/swift is overloaded). The build is retried according to the retry scheduler; however, before the instance's initial neutron port is released, another one is allocated. Both neutron ports remain associated with the rescheduled instance, so the instance, once it builds successfully, has two IPs.

This can lead to running out of IPs in the range during a multiple-instance boot if the size of the range is close to the number of concurrent boots.

Instances that are still trying to build or reschedule then get an error because no IPs remain in the range.
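
To illustrate the exhaustion arithmetic, here is a toy model, not nova's or neutron's actual code. It assumes each boot attempt takes one IP from the pool and that a failed first attempt is retried once, either keeping its first port (the behaviour described above) or releasing it first. The function name simulate_boot, the pool size of 64 and the count of 10 failed first attempts are hypothetical values chosen for illustration; only the 60 concurrent instances come from the report. Boots are processed sequentially, which is a simplification.

```python
def simulate_boot(pool_size, instances, failing_first_attempt, release_on_reschedule):
    """Toy model: return (ports_in_use, instances_in_error) after all boots and retries."""
    free = pool_size
    ports_in_use = 0
    errors = 0
    retries = []

    # First attempt: every instance allocates one port (one IP).
    for index in range(instances):
        if free == 0:
            errors += 1
            continue
        free -= 1
        ports_in_use += 1
        if index in failing_first_attempt:
            if release_on_reschedule:
                # Expected behaviour: the first port is released before the retry.
                free += 1
                ports_in_use -= 1
            retries.append(index)

    # Retry: each rescheduled instance allocates another port.
    for _ in retries:
        if free == 0:
            errors += 1  # no IPs remain in the range
            continue
        free -= 1
        ports_in_use += 1

    return ports_in_use, errors


failed = set(range(10))  # hypothetical: 10 of the 60 instances fail their first build

print(simulate_boot(64, 60, failed, release_on_reschedule=False))  # (64, 6)
print(simulate_boot(64, 60, failed, release_on_reschedule=True))   # (60, 0)
```

Under these assumptions, leaking the first port leaves four rescheduled instances with two IPs each and six instances in error, while releasing it lets all 60 instances boot with one IP each.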

Comment 11 melanie witt 2017-07-18 18:15:22 UTC
The upstream bug [1] was solved by this patch in Ocata [2] and backported to Newton [3], so I think we just need to backport [3] to OSP 9.

[1] https://bugs.launchpad.net/nova/+bug/1609526
[2] https://review.openstack.org/#/c/393805
[3] https://review.openstack.org/#/c/396782

Comment 13 Artom Lifshitz 2018-08-01 15:47:04 UTC
This thing where an instance gets rescheduled and gets two IPs has been reported before [1] (though that ended up getting closed EOL because it's OSP11). I'm not sure it's still happening in master, but definitely worth looking into.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1469780