Bug 1469780 - Instance boots with two IPs although only one net-id is supplied [NEEDINFO]
Status: NEW
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 11.0 (Ocata)
Hardware: Unspecified  OS: Unspecified
Priority: high  Severity: high
Target Milestone: ---
Target Release: 11.0 (Ocata)
Assigned To: Eoghan Glynn
QA Contact: Joe H. Rahme
Whiteboard: scale_lab, aos-scalability-36
Keywords: Triaged, ZStream
Depends On:
Blocks:
Reported: 2017-07-11 16:05 EDT by Sai Sindhur Malleni
Modified: 2018-02-20 13:11 EST (History)
CC List: 19 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
ipetrova: needinfo? (alifshit)


Attachments: None
Description Sai Sindhur Malleni 2017-07-11 16:05:58 EDT
Description of problem: An instance is created using the command openstack server create --image 2017-06-21-ocp-3.6.121-atomic --flavor m1.small --nic net-id=85caaf07-bf6f-415a-a616-44e0325baad8 svt-i-1. The instance is expected to have only one IP annd one NIC. However, the instance boots with two IPs and two NICs(eth0 and eth1), however eth1 wasn;t even setup (had no entry in /etc/sysconfig/network-scripts/)

Running "openstack server list", we see the VM with two IPs on the same subnet:

| 743427ab-af57-4704-a9d1-1d02f5c9ed3a | svt-ng-23  | ACTIVE  | private=172.16.1.225, 172.16.1.222 | 2017-06-28-ocp-3.6.126.1-2-rhel |

The instance id is 743427ab-af57-4704-a9d1-1d02f5c9ed3a
Both neutron ports are active. The port ids of the neutron ports are 
172.16.1.225- 9cbec257-81ac-4b4a-b0b2-bbaa0c13f26c
172.16.1.222- 380883d7-e9c2-4bb0-9c11-a96ae94a0d01

The instance is only reachable via the IP 172.16.1.225.
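
As an aside, affected servers can be spotted mechanically from the Networks column of "openstack server list" output; the helper below is a hypothetical sketch (not part of this bug's tooling) that flags any network carrying more than one IP:

```python
# Hypothetical helper: parse the Networks column of "openstack server list"
# (e.g. "private=172.16.1.225, 172.16.1.222") and report networks that
# carry more than one IP address, like the affected instance above.
def multi_ip_networks(networks_field):
    nets = {}
    for chunk in networks_field.split(";"):
        name, _, ips = chunk.strip().partition("=")
        nets[name] = [ip.strip() for ip in ips.split(",")]
    return {name: ips for name, ips in nets.items() if len(ips) > 1}

print(multi_ip_networks("private=172.16.1.225, 172.16.1.222"))
# {'private': ['172.16.1.225', '172.16.1.222']}
```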

Grepping with the instance ID and port ids in the neutron and nova logs on the 3 controllers we have the following: https://gist.github.com/smalleni/3ae0792440d0276f62f9d865ce0001ad

On doing an ip a on the instance we see https://gist.github.com/smalleni/2214f9ca06e22f4e4ad31a9f2cf5702f



Version-Release number of selected component (if applicable):
11 GA

How reproducible:
Happened on two instances in this environment

Steps to Reproduce:
1. Boot an instance, passing only one net-id
2.
3.

Actual results:
Instance boots with two neutron ports and two IPs

Expected results:
Instance boots with only one IP

Additional info:
Comment 1 Jakub Libosvar 2017-07-13 08:42:28 EDT
Can you please attach Nova and Neutron logs from all controllers?
Comment 2 Sai Sindhur Malleni 2017-07-13 09:10:44 EDT
I grepped for instance id, and the two port ids in nova/neutron logs and pasted it here- https://gist.github.com/smalleni/3ae0792440d0276f62f9d865ce0001ad

Do you need the full logs? There's a lot of them.
Comment 3 Jakub Libosvar 2017-07-13 12:15:59 EDT
From the provided logs it seems Neutron is asked twice for the port. We have the following ports:
9cbec257-81ac-4b4a-b0b2-bbaa0c13f26c created at: 2017-07-10T22:52:46Z
380883d7-e9c2-4bb0-9c11-a96ae94a0d01 created at: 2017-07-10T22:53:22Z

I can see for each port a POST request:

2017-07-10 22:52:47.453 [c0] 129603 INFO neutron.wsgi [req-26a883fa-f597-4cc5-beac-1334c9ad61a1 accde8c8d00f4ae0a765fa7df5a20d7e 6934d26fc8f64fbdb12a80de7850b958 - - -] 192.168.25.16 - - [10/Jul/2017 22:52:47] "POST /v2.0/ports.json HTTP/1.1" 201 1047 1.111623
2017-07-10 22:53:23.607 [c2] 490233 INFO neutron.wsgi [req-fa3834ea-d3e6-4012-b213-ca991aa9cae2 accde8c8d00f4ae0a765fa7df5a20d7e 6934d26fc8f64fbdb12a80de7850b958 - - -] 192.168.25.16 - - [10/Jul/2017 22:53:23] "POST /v2.0/ports.json HTTP/1.1" 201 1047 1.049874


So either nova or haproxy sends the request twice.
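
A rough way to confirm the double-POST theory mechanically, assuming the wsgi log format shown above, is to pull out the port-create POSTs and flag two requests from the same project landing close together (this is a hypothetical sketch, not an existing tool):

```python
import re
from datetime import datetime

# Hypothetical sketch: find "POST /v2.0/ports.json" lines in neutron wsgi
# logs and flag consecutive requests from the same project within a short
# window, as with the two port creations 36 seconds apart above.
LOG = re.compile(r"^(\S+ \S+).*?\[req-(\S+) (\S+) (\S+).*POST /v2\.0/ports\.json")

def duplicate_posts(lines, window_s=60):
    hits = []
    for line in lines:
        m = LOG.match(line)
        if m:
            ts = datetime.strptime(m.group(1)[:19], "%Y-%m-%d %H:%M:%S")
            hits.append((ts, m.group(2), m.group(4)))  # (time, req-id, project)
    return [(r1, r2)
            for (t1, r1, p1), (t2, r2, p2) in zip(hits, hits[1:])
            if p1 == p2 and (t2 - t1).total_seconds() <= window_s]
```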
Comment 4 Artom Lifshitz 2017-07-13 14:26:13 EDT
I took a look at the env; debug wasn't universally enabled, so it was of limited use. I'll try to use the sosreports from bz 1390199 to figure it out. Comment 5 in that bz [1] suggests that this happens if an instance is rescheduled - this would make it the same as upstream bug 1609526 [2]. If that's indeed what's happening we'll move the bug to openstack-nova and take care of the fix.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1390199#c5
[2] https://bugs.launchpad.net/nova/+bug/1609526
Comment 5 Jakub Libosvar 2017-07-24 07:54:15 EDT
I'm flipping this to nova to triage as the symptoms are the same as in upstream bug from comment 4 - https://bugs.launchpad.net/nova/+bug/1639230

But that bug should already be fixed in Newton, while this is OSP 11 (Ocata).
Comment 11 Artom Lifshitz 2018-02-06 14:24:09 EST
This can't be bug 1639230 [1], because that's been fixed in Ocata (OSP11) and backported to Newton (OSP10).

I took a look at the sosreports in case 01927760. Based on comment #6, I looked for instance 35bed9a2-071b-472d-a3e8-cfb6c823842a in order to trace its creation from nova API accepting the boot request to the instance running on the compute. I found the following line in the compute log (slightly formatted for legibility):

  $ grep 2>/dev/null -I 35bed9a2-071b-472d-a3e8-cfb6c823842a.*build -R .

    ./pollux-tds-compute-001.localdomain/var/log/nova/nova-compute.log:
    2017-09-08 12:17:39.716 21691 INFO nova.compute.manager
    [req-d532789b-4b1e-4b55-b885-391a055d1c6a
    8c7fb2e59d8a4e138fd0453e287dafe6
    1a4003cd5dd5425799a578968c2b7c1a - - -]
    [instance: 35bed9a2-071b-472d-a3e8-cfb6c823842a]
    Took 4.79 seconds to build instance.
So it looks like the request ID that we need to trace is req-d532789b-4b1e-4b55-b885-391a055d1c6a. Now the confusion starts. In the conductor logs, that request ID is associated with the following instance IDs:

    72fdcf4e-7bd9-4b46-97dd-225c27ff31c4
    0bb50738-1d96-43c3-a09c-5656d5859ab0

However, in the compute logs:

* Instance 72fdcf4e-7bd9-4b46-97dd-225c27ff31c4 does not appear for that
  request ID

* Instance 0bb50738-1d96-43c3-a09c-5656d5859ab0 does appear, thankfully

* Other instance IDs appear for that request ID:
  0992cd78-757e-4fb9-baf6-b48a9608ec6c
  35bed9a2-071b-472d-a3e8-cfb6c823842a
  4179df9e-4402-47d8-b786-36fd02dfbc8e
  9b109be0-4169-45d1-b8e9-cb313e341ef3
  c1370b06-0953-462f-b8f0-b5bf16f2bbeb
  caa7b249-245c-48d6-974e-5beb10c2454d
  d30313e5-f8ff-4926-b22d-eebec7f68d58

I'm guessing the most reasonable explanation for all this is that:

1. The boot request that created instance 35bed9a2-071b-472d-a3e8-cfb6c823842a
   was for multiple instances

2. Some sosreports from other computes and controllers are missing. 

For point 1, can I ask you to do some testing and rule out whether this bug needs a multi-VM boot request to be reproduced? In other words, can this bug be reproduced by booting a single VM at a time?

For point 2, I would like to have sosreports from all nodes.

If either point 1 or 2 is wrong - in other words, if the new instance IDs in nova-compute.log are not from a different controller whose sosreports I don't have - then I would think about blaming rabbitmq, perhaps because the compute service is getting instance build requests replayed more than once for a given instance.
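
The request-ID cross-referencing above can be sketched as a small log pass that maps each request ID to the instance UUIDs it touches; a request ID fanning out to many instances would support the multi-VM boot theory in point 1 (a hypothetical illustration - the log format is assumed from the excerpts in this comment):

```python
import re
from collections import defaultdict

# Hypothetical sketch: group instance UUIDs by nova request ID in a
# compute log; one request ID mapping to several instances suggests a
# multi-instance boot request.
REQ = re.compile(r"\[(req-[0-9a-f-]+) .*?\[instance: ([0-9a-f-]+)\]")

def instances_per_request(lines):
    seen = defaultdict(set)
    for line in lines:
        m = REQ.search(line)
        if m:
            seen[m.group(1)].add(m.group(2))
    return seen
```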

Before we go there however, I would like to check points 1 and 2 with you.

Cheers!

[1] https://bugs.launchpad.net/nova/+bug/1639230
