Description of problem:

An instance is created using the command:

openstack server create --image 2017-06-21-ocp-3.6.121-atomic --flavor m1.small --nic net-id=85caaf07-bf6f-415a-a616-44e0325baad8 svt-i-1

The instance is expected to have only one IP and one NIC. However, the instance boots with two IPs and two NICs (eth0 and eth1), and eth1 wasn't even set up (it had no entry in /etc/sysconfig/network-scripts/).

On doing an openstack server list, we see the VM with two IPs on the same subnet:

| 743427ab-af57-4704-a9d1-1d02f5c9ed3a | svt-ng-23 | ACTIVE | private=172.16.1.225, 172.16.1.222 | 2017-06-28-ocp-3.6.126.1-2-rhel |

The instance id is 743427ab-af57-4704-a9d1-1d02f5c9ed3a. Both neutron ports are active. The port ids of the neutron ports are:

172.16.1.225 - 9cbec257-81ac-4b4a-b0b2-bbaa0c13f26c
172.16.1.222 - 380883d7-e9c2-4bb0-9c11-a96ae94a0d01

The instance is only reachable via the IP 172.16.1.225.

Grepping for the instance ID and port ids in the neutron and nova logs on the 3 controllers, we have the following: https://gist.github.com/smalleni/3ae0792440d0276f62f9d865ce0001ad

On doing an ip a on the instance, we see: https://gist.github.com/smalleni/2214f9ca06e22f4e4ad31a9f2cf5702f

Version-Release number of selected component (if applicable):
11 GA

How reproducible:
Happened on two instances in this environment

Steps to Reproduce:
1. Boot an instance, passing only one net-id

Actual results:
Instance boots with two neutron ports and IPs

Expected results:
Instance boots with one IP

Additional info:
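For triage, affected servers can be spotted directly from the Networks column of openstack server list output. A minimal sketch (not part of any OpenStack tooling; it only assumes the "net=ip, ip" format shown in the table above):

```python
import ipaddress

def multi_ip_networks(addresses):
    """Return networks that carry more than one fixed IP for a server.

    `addresses` is the Networks column from `openstack server list`,
    e.g. "private=172.16.1.225, 172.16.1.222".
    """
    nets = {}
    current = None
    for token in addresses.split(","):
        token = token.strip()
        if "=" in token:
            current, ip = token.split("=", 1)
        else:
            ip = token
        ipaddress.ip_address(ip)  # validate; raises ValueError on junk
        nets.setdefault(current, []).append(ip)
    # keep only networks with duplicate fixed IPs
    return {net: ips for net, ips in nets.items() if len(ips) > 1}

# The affected instance from this report:
print(multi_ip_networks("private=172.16.1.225, 172.16.1.222"))
# -> {'private': ['172.16.1.225', '172.16.1.222']}
```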
Can you please attach Nova and Neutron logs from all controllers?
I grepped for the instance id and the two port ids in the nova/neutron logs and pasted the results here: https://gist.github.com/smalleni/3ae0792440d0276f62f9d865ce0001ad Do you need the full logs? There's a lot of them.
From the provided logs it seems Neutron is asked twice for the port. We have the following ports:

9cbec257-81ac-4b4a-b0b2-bbaa0c13f26c created at: 2017-07-10T22:52:46Z
380883d7-e9c2-4bb0-9c11-a96ae94a0d01 created at: 2017-07-10T22:53:22Z

I can see a POST request for each port:

2017-07-10 22:52:47.453 [c0] 129603 INFO neutron.wsgi [req-26a883fa-f597-4cc5-beac-1334c9ad61a1 accde8c8d00f4ae0a765fa7df5a20d7e 6934d26fc8f64fbdb12a80de7850b958 - - -] 192.168.25.16 - - [10/Jul/2017 22:52:47] "POST /v2.0/ports.json HTTP/1.1" 201 1047 1.111623

2017-07-10 22:53:23.607 [c2] 490233 INFO neutron.wsgi [req-fa3834ea-d3e6-4012-b213-ca991aa9cae2 accde8c8d00f4ae0a765fa7df5a20d7e 6934d26fc8f64fbdb12a80de7850b958 - - -] 192.168.25.16 - - [10/Jul/2017 22:53:23] "POST /v2.0/ports.json HTTP/1.1" 201 1047 1.049874

So either nova or haproxy sends the request twice.
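For anyone reproducing this analysis, the duplicate port creations can be confirmed by counting the POST /v2.0/ports.json access lines in the neutron-server wsgi logs. A rough sketch (the regex assumes the log layout in the excerpts above, which can vary across releases; the request id and timestamp are what you correlate against the nova logs):

```python
import re

# Matches neutron.wsgi access-log lines for port-create requests.
PORT_POST = re.compile(
    r'^(?P<ts>\S+ \S+).*\[(?P<req>req-[0-9a-f-]+).*"POST /v2\.0/ports\.json'
)

def port_create_requests(log_lines):
    """Return (timestamp, request-id) for each port-create POST seen."""
    hits = []
    for line in log_lines:
        m = PORT_POST.search(line)
        if m:
            hits.append((m.group("ts"), m.group("req")))
    return hits
```

Two hits for a single boot with one --nic is exactly the duplicate-request symptom described above.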
I took a look at the env; debug wasn't universally enabled, so it was of limited use. I'll try to use the sosreports from bz 1390199 to figure it out. Comment 5 in that bz [1] suggests that this happens if an instance is rescheduled - this would make it the same as upstream bug 1609526 [2]. If that's indeed what's happening, we'll move the bug to openstack-nova and take care of the fix.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1390199#c5
[2] https://bugs.launchpad.net/nova/+bug/1609526
I'm flipping this to nova to triage, as the symptoms are the same as in the upstream bug from comment 4 - https://bugs.launchpad.net/nova/+bug/1639230 But that should be fixed in Newton, while this is OSP 11 (Ocata).
Artom,

On 1): we got confirmation on the problem. Intel® Virtualization Technology for Directed I/O (VT-d) was *NOT* enabled in the BIOS. Upon enabling it, they could load the kvm_intel kernel module and now everything works as expected.

On 2): we still have a bug. :) Randomly creating/attaching neutron ports is not what I would expect if virtualization is disabled at the hardware level. Maybe the easiest way to work around this is for nova-scheduler to perform a quick check whether virtualization is enabled and, if not, simply discard the node. Anyway, I don't work on Nova; this just seems logical and the simplest way to avoid unexpected behavior such as this.
Citellus check for the 1st issue: https://review.gerrithub.io/403670

~~~
#####################################################################
# Check with Citellus
cd                       # go to $HOME so the clone lands in ~/citellus
git clone https://github.com/zerodayz/citellus/
~/citellus/citellus.py $SOSREPORT
#####################################################################
~~~
(In reply to Irina Petrova from comment #15)
> Artom,
>
> On 1): we got confirmation on the problem. Intel® Virtualization Technology
> for Directed I/O (VT-d) was *NOT* enabled in BIOS. Upon enabling it, they
> could load the kvm_intel kernel module and now everything works as expected.

It would be good if nova-compute performed a check on startup to verify that virtualization is enabled, based on the CPU flags. If not, nova-compute startup should either fail or at least log it in the compute log. @Artom, does that sound like a valid RFE? Should we file one?

> On 2): we still have a Bug. :) Randomly creating/attaching neutron ports is
> not what I would expect if virtualization is disabled on HW level. Maybe the
> easiest way to workaround this is for nova-scheduler to perform a quick
> check if virtualization is enabled. If not, just to discard the node.

The nova scheduler could only check this if nova-compute reported it in some way, but I think the question is whether we could end up in this situation whenever the instance start fails with some other issue on the compute node and gets re-scheduled.
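As a sketch of what such a startup check could look like (the function name and any wiring into nova-compute are hypothetical, not existing Nova code; it only inspects /proc/cpuinfo for the Intel vmx / AMD svm flags):

```python
def virt_flags(cpuinfo_text):
    """Return the hardware-virtualization flags present in cpuinfo text.

    An empty result means VT-x/AMD-V is disabled in the BIOS (or the CPU
    lacks it), in which case kvm_intel/kvm_amd will fail to load - the
    situation diagnosed in comment #15.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    return flags & {"vmx", "svm"}

def check_host():
    # Hypothetical startup hook: read the real /proc/cpuinfo and warn.
    with open("/proc/cpuinfo") as f:
        if not virt_flags(f.read()):
            print("WARNING: no vmx/svm flag - enable virtualization in BIOS")
```

This is just the flag check; a real implementation would also need to handle nested virtualization and non-x86 architectures.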
OSP11 is now retired, see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828