Bug 1469780 - Instance boots with two IPs although only one net-id is supplied
Summary: Instance boots with two IPs although only one net-id is supplied
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 11.0 (Ocata)
Assignee: Artom Lifshitz
QA Contact: OSP DFG:Compute
URL:
Whiteboard: scale_lab, aos-scalability-36
Depends On:
Blocks:
 
Reported: 2017-07-11 20:05 UTC by Sai Sindhur Malleni
Modified: 2023-03-21 18:43 UTC
CC List: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-22 12:44:48 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1390199 0 medium CLOSED Launching instance sometimes results in having 2 IP's from the same network. 2023-03-21 18:38:40 UTC
Red Hat Bugzilla 1516566 0 unspecified CLOSED 2 IP addresses are assigned to instance launched after rebooting all the nodes in the setup. 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker OSP-4657 0 None None None 2022-08-16 12:50:48 UTC

Internal Links: 1390199 1516566

Description Sai Sindhur Malleni 2017-07-11 20:05:58 UTC
Description of problem: An instance is created using the command: openstack server create --image 2017-06-21-ocp-3.6.121-atomic --flavor m1.small --nic net-id=85caaf07-bf6f-415a-a616-44e0325baad8 svt-i-1. The instance is expected to have only one IP and one NIC. However, it boots with two IPs and two NICs (eth0 and eth1), even though eth1 was never set up (it has no entry in /etc/sysconfig/network-scripts/).

On doing an openstack server list, we see the VM with two IPs on the same subnet:

| 743427ab-af57-4704-a9d1-1d02f5c9ed3a | svt-ng-23  | ACTIVE  | private=172.16.1.225, 172.16.1.222 | 2017-06-28-ocp-3.6.126.1-2-rhel |

The instance id is 743427ab-af57-4704-a9d1-1d02f5c9ed3a
Both neutron ports are active. The port IDs are:
172.16.1.225- 9cbec257-81ac-4b4a-b0b2-bbaa0c13f26c
172.16.1.222- 380883d7-e9c2-4bb0-9c11-a96ae94a0d01

The instance is only reachable via the IP 172.16.1.225.

Grepping for the instance ID and port IDs in the neutron and nova logs on the 3 controllers, we have the following: https://gist.github.com/smalleni/3ae0792440d0276f62f9d865ce0001ad

On doing an ip a on the instance we see https://gist.github.com/smalleni/2214f9ca06e22f4e4ad31a9f2cf5702f
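
For reference, here is a minimal diagnostic sketch for enumerating the instance's Neutron ports from Python, assuming the openstacksdk bindings are installed and a clouds.yaml entry is configured (the cloud name "overcloud" is just a placeholder):

~~~
# List all Neutron ports bound to the affected instance and flag any
# duplicates on the same network.
import collections
import openstack

conn = openstack.connect(cloud="overcloud")  # placeholder cloud name

instance_id = "743427ab-af57-4704-a9d1-1d02f5c9ed3a"  # from the report
ports = list(conn.network.ports(device_id=instance_id))

by_network = collections.defaultdict(list)
for port in ports:
    ips = [ip["ip_address"] for ip in port.fixed_ips]
    print(port.id, port.network_id, port.status, ips)
    by_network[port.network_id].append(port)

for network_id, dup_ports in by_network.items():
    if len(dup_ports) > 1:
        print("Duplicate ports on network", network_id, ":",
              [p.id for p in dup_ports])
        # The unused port could then be removed manually with
        # conn.network.delete_port(p.id) -- after verifying which port
        # the guest actually uses.
~~~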



Version-Release number of selected component (if applicable):
11 GA

How reproducible:
Happened on two instances in this environment

Steps to Reproduce:
1. Boot an instance, passing only one net-id
2.
3.

Actual results:
Instance boots with two neutron ports and two IPs

Expected results:
Instance boots with only one IP

Additional info:

Comment 1 Jakub Libosvar 2017-07-13 12:42:28 UTC
Can you please attach Nova and Neutron logs from all controllers?

Comment 2 Sai Sindhur Malleni 2017-07-13 13:10:44 UTC
I grepped for instance id, and the two port ids in nova/neutron logs and pasted it here- https://gist.github.com/smalleni/3ae0792440d0276f62f9d865ce0001ad

Do you need the full logs? There's a lot of them.

Comment 3 Jakub Libosvar 2017-07-13 16:15:59 UTC
From the provided logs it seems Neutron is asked to create the port twice; we have the following ports:
9cbec257-81ac-4b4a-b0b2-bbaa0c13f26c created at: 2017-07-10T22:52:46Z
380883d7-e9c2-4bb0-9c11-a96ae94a0d01 created at: 2017-07-10T22:53:22Z

I can see for each port a POST request:

2017-07-10 22:52:47.453 [c0] 129603 INFO neutron.wsgi [req-26a883fa-f597-4cc5-beac-1334c9ad61a1 accde8c8d00f4ae0a765fa7df5a20d7e 6934d26fc8f64fbdb12a80de7850b958 - - -] 192.168.25.16 - - [10/Jul/2017 22:52:47] "POST /v2.0/ports.json HTTP/1.1" 201 1047 1.111623
2017-07-10 22:53:23.607 [c2] 490233 INFO neutron.wsgi [req-fa3834ea-d3e6-4012-b213-ca991aa9cae2 accde8c8d00f4ae0a765fa7df5a20d7e 6934d26fc8f64fbdb12a80de7850b958 - - -] 192.168.25.16 - - [10/Jul/2017 22:53:23] "POST /v2.0/ports.json HTTP/1.1" 201 1047 1.049874


So either nova or haproxy sends the request twice.
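
To narrow down who issued the second create, a rough scan of the Neutron server logs for port-create POSTs could look like the sketch below (assumptions: the usual /var/log/neutron/server.log* location, which may differ per deployment):

~~~
# Pull every "POST /v2.0/ports" line from the Neutron server logs and print
# the timestamp, request id and client address, so duplicate port creates
# can be traced back to the caller (nova-compute vs. an haproxy retry).
import glob
import re

PATTERN = re.compile(
    r"^(?P<ts>\S+ \S+).*\[(?P<req>req-[0-9a-f-]+).*?\] "
    r"(?P<client>\S+) - - .*\"POST /v2\.0/ports\.json HTTP/1\.1\" 201"
)

for path in glob.glob("/var/log/neutron/server.log*"):  # assumed path
    with open(path, errors="replace") as fh:
        for line in fh:
            match = PATTERN.search(line)
            if match:
                print(path, match.group("ts"), match.group("req"),
                      match.group("client"))
~~~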

Comment 4 Artom Lifshitz 2017-07-13 18:26:13 UTC
I took a look at the env; debug wasn't universally enabled, so it was of limited use. I'll try to use the sosreports from bz 1390199 to figure it out. Comment 5 in that bz [1] suggests that this happens if an instance is rescheduled - this would make it the same as upstream bug 1609526 [2]. If that's indeed what's happening, we'll move the bug to openstack-nova and take care of the fix.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1390199#c5
[2] https://bugs.launchpad.net/nova/+bug/1609526
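
One way to check the reschedule hypothesis is to look at the instance's action records (the same data "nova instance-action-list <uuid>" shows). A minimal sketch with openstacksdk, where the cloud name "overcloud" is only a placeholder:

~~~
# Query the os-instance-actions API for the affected instance: a create
# that was rescheduled shows up as additional action records / request ids.
import openstack

conn = openstack.connect(cloud="overcloud")  # placeholder cloud name
instance_id = "743427ab-af57-4704-a9d1-1d02f5c9ed3a"

resp = conn.compute.get(
    "/servers/{}/os-instance-actions".format(instance_id))
for action in resp.json().get("instanceActions", []):
    print(action["action"], action["start_time"], action["request_id"])
~~~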

Comment 5 Jakub Libosvar 2017-07-24 11:54:15 UTC
I'm flipping this to nova to triage, as the symptoms are the same as in the upstream bug from comment 4 - https://bugs.launchpad.net/nova/+bug/1639230

However, that should already be fixed in Newton, while this is OSP 11 (Ocata).

Comment 15 Irina Petrova 2018-03-01 08:29:25 UTC
Artom,

On 1): we got confirmation of the problem. Intel® Virtualization Technology for Directed I/O (VT-d) was *NOT* enabled in the BIOS. After enabling it, they could load the kvm_intel kernel module and everything now works as expected.

On 2): we still have a bug. :) Randomly creating/attaching neutron ports is not what I would expect when virtualization is disabled at the HW level. Maybe the easiest way to work around this is for nova-scheduler to perform a quick check of whether virtualization is enabled and, if not, simply discard the node.

Anyway, I don't work on Nova; this just seems like the logical and simplest way to avoid unexpected behavior such as this.
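
As an illustration of the suggested check (not nova code, just a sketch of the idea): a host could be flagged when the CPU flags show no vmx/svm support or /dev/kvm is missing because kvm_intel/kvm_amd could not load.

~~~
# Sketch only: detect whether hardware virtualization is usable on this host.
import os

def virtualization_available():
    with open("/proc/cpuinfo") as fh:
        flags_ok = any(
            line.startswith("flags") and ("vmx" in line or "svm" in line)
            for line in fh)
    # /dev/kvm only exists when the kvm_intel/kvm_amd module loaded.
    return flags_ok and os.path.exists("/dev/kvm")

if not virtualization_available():
    print("WARNING: hardware virtualization disabled or /dev/kvm missing; "
          "check BIOS VT-x/VT-d settings and the kvm_intel/kvm_amd modules")
~~~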

Comment 16 Pablo Iranzo Gómez 2018-03-13 12:41:57 UTC
Citellus check for 1st issue: https://review.gerrithub.io/403670

~~~
#####################################################################
# Check with Citellus

cd

git clone https://github.com/zerodayz/citellus/
~/citellus/citellus.py $SOSREPORT

#####################################################################
~~~

Comment 17 Martin Schuppert 2018-03-13 13:02:51 UTC
(In reply to Irina Petrova from comment #15)
> Artom,
> 
> On 1): we got confirmation on the problem. Intel® Virtualization Technology
> for Directed I/O (VT-d) was *NOT* enabled in BIOS. Upon enabling it, they
> could load the kvm_intel kernel module and now everything works as expected.

It would be good if nova-compute performed a check on startup to verify whether virtualization is enabled, based on the CPU flags. If not, nova-compute startup should either fail or at least log it in the compute log.

@Artom, does that sound like a valid RFE? Should we file one?

> On 2): we still have a Bug. :) Randomly creating/attaching neutron ports is
> not what I would expect if virtualization is disabled on HW level. Maybe the
> easiest way to workaround this is for nova-scheduler to perform a quick
> check if virtualization is enabled. If not, just to discard the node.

nova-scheduler could only check this if nova-compute reports it in some way, but I think the question is whether we could end up in this situation if the instance start fails with some other issue on the compute and gets re-scheduled.

Comment 20 Scott Lewis 2018-06-22 12:44:48 UTC
OSP11 is now retired, see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828

