Bug 1699991 - Instance is failing to spawn if its IP from the tenant network is also assigned to the Compute node
Summary: Instance is failing to spawn if its IP from the tenant network is also assigned to the Compute node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-os-vif
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: z7
Target Release: 13.0 (Queens)
Assignee: Rodolfo Alonso
QA Contact: Candido Campos
URL:
Whiteboard:
Depends On:
Blocks: 1709366
 
Reported: 2019-04-15 14:29 UTC by Alex Stupnikov
Modified: 2022-08-09 15:08 UTC (History)
CC List: 10 users

Fixed In Version: python2-os-vif-1.9.1-3.el7ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-03 16:58:10 UTC
Target Upstream Version:
Embargoed:


Attachments


Links:
  Launchpad bug 1825888 (last updated 2019-04-22 20:14:23 UTC)
  OpenStack gerrit change 655694, MERGED: Prevent "qbr" Linux Bridge from replying to ARP messages (last updated 2021-02-02 05:04:50 UTC)
  Red Hat Issue Tracker OSP-7945 (last updated 2022-08-09 15:08:53 UTC)
  Red Hat Knowledge Base (Solution) 4133771, Troubleshoot: iptables_hybrid OVS firewall driver doesn't work properly (last updated 2019-05-13 13:49:39 UTC)
  Red Hat Product Errata RHBA-2019:2623 (last updated 2019-09-03 16:58:32 UTC)

Description Alex Stupnikov 2019-04-15 14:29:11 UTC
Description of problem:

Consider a situation where an IP address that is assigned to the Compute node itself also belongs to the allocation pool of a tenant subnet. If a user launches an instance and that instance is allocated an IP address that its Compute node also owns, the VM may fail the DHCP allocation process: some guest operating systems (RHEL, CentOS, etc.) send an ARP request to confirm that the address obtained via DHCP is not in use by another network entity. The Compute node receives this ARP request on the tap device that emulates the VM's NIC and replies that the address is in use, so the guest rejects the lease.
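A quick way to confirm this behaviour on the Compute node (a hedged sketch; the bridge name is taken from the tcpdump below, and the arp_ignore interpretation is our reading of the default Linux behaviour rather than something stated in this report):

    # arp_ignore=0 (the kernel default) makes the host answer ARP requests for
    # any locally-owned IP arriving on any interface, including a qbr Linux bridge
    sysctl net.ipv4.conf.all.arp_ignore net.ipv4.conf.default.arp_ignore

    # Watch the bridge while the guest boots; an ARP reply sourced from the
    # bridge's own MAC for the instance's address confirms the failure mode
    tcpdump -nei qbrbd736ed8-6a arp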


Steps to reproduce:

- Check the Compute node's IP addresses and networks and pick a network/IP pair to test with, for example 172.17.3.0/24 -> 172.17.3.29.
- Create a Neutron network with a matching subnet and set the DHCP allocation pool so that it includes IPs for the DHCP agents and for the VM itself (or assign the fixed IP explicitly).
- Schedule a CentOS/RHEL VM on that Compute node (see the CLI sketch below).
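A sketch of the reproduction with the openstack CLI, assuming the Compute node owns 172.17.3.29 on the 172.17.3.0/24 network; the network, image, flavor and host names below are illustrative:

    # Tenant network whose allocation pool overlaps the Compute node's own range
    openstack network create arp-test
    openstack subnet create arp-test-subnet --network arp-test \
        --subnet-range 172.17.3.0/24 \
        --allocation-pool start=172.17.3.20,end=172.17.3.40

    # Boot a RHEL/CentOS guest on the affected Compute node with the node's own
    # IP as the fixed address (availability-zone host pinning requires admin)
    openstack server create rhel-arp-test \
        --image rhel7 --flavor m1.small \
        --nic net-id=arp-test,v4-fixed-ip=172.17.3.29 \
        --availability-zone nova:compute-0.localdomain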


Expected result: the VM connects to the network, since the tenant network has nothing to do with the Compute node's own addressing.

Actual result: the VM fails to get a DHCP lease and cannot connect to the network.
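
One way to observe the actual result from the API side, assuming a guest image whose DHCP client logs to the serial console (server name as in the illustrative sketch above):

    # Repeated DHCPDISCOVER retries or DHCPDECLINE messages in the console log
    # indicate the lease was offered but rejected by the guest
    openstack console log show rhel-arp-test | grep -iE 'dhcp|declin'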


Tcpdump:


[root@compute-0 ~]# tcpdump -i qbrbd736ed8-6a
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on qbrbd736ed8-6a, link-type EN10MB (Ethernet), capture size 262144 bytes
13:15:48.474271 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
13:15:48.662992 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:1e:a0:6e (oui Unknown), length 300
13:15:48.665387 IP 172.17.3.27.bootps > compute-0.storage.localdomain.bootpc: BOOTP/DHCP, Reply, length 338
13:15:48.666152 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:1e:a0:6e (oui Unknown), length 300
13:15:48.668467 IP 172.17.3.26.bootps > compute-0.storage.localdomain.bootpc: BOOTP/DHCP, Reply, length 338
13:15:48.669480 IP 172.17.3.27.bootps > compute-0.storage.localdomain.bootpc: BOOTP/DHCP, Reply, length 338
13:15:48.742143 ARP, Request who-has compute-0.storage.localdomain (Broadcast) tell 0.0.0.0, length 28
13:15:48.742171 ARP, Reply compute-0.storage.localdomain is-at 66:5a:44:9e:90:1f (oui Unknown), length 28
13:15:48.755438 ARP, Request who-has compute-0.storage.localdomain (Broadcast) tell 0.0.0.0, length 28
13:15:48.755463 ARP, Reply compute-0.storage.localdomain is-at 66:5a:44:9e:90:1f (oui Unknown), length 28

[root@compute-0 ~]# ip link show qbrbd736ed8-6a
21: qbrbd736ed8-6a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 66:5a:44:9e:90:1f brd ff:ff:ff:ff:ff:ff

Note that 66:5a:44:9e:90:1f, the source of the ARP replies in the capture above, is the MAC address of the qbr Linux bridge itself: the Compute node's kernel is answering the guest's ARP probe for the duplicate address.



Workaround (ebtables rules on the Compute node; the full ebtables-save output is in comment 1):

    -A INPUT -p ARP -i tapbd736ed8-6a -j DROP
    -A OUTPUT -p ARP -o tapbd736ed8-6a -j DROP


Additional info: I am not sure whether this issue should be investigated by the Nova or the Neutron squad. Please re-assign if needed.

Comment 1 Alex Stupnikov 2019-04-15 14:29:50 UTC
Workaround:

[root@compute-0 ~]# ebtables-save 
# Generated by ebtables-save v1.0 on Mon Apr 15 13:53:30 UTC 2019
*filter
:INPUT ACCEPT
:FORWARD ACCEPT
:OUTPUT ACCEPT
-A INPUT -p ARP -i tapbd736ed8-6a -j DROP
-A OUTPUT -p ARP -o tapbd736ed8-6a -j DROP
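
For reference, the same rules can be added to a running Compute node without editing the saved table (tap device name taken from this report; rules added this way are not persistent across reboots or port re-plugs):

    ebtables -A INPUT -p ARP -i tapbd736ed8-6a -j DROP
    ebtables -A OUTPUT -p ARP -o tapbd736ed8-6a -j DROP

Because only the INPUT and OUTPUT chains are touched, ARP traffic forwarded between the instance and the rest of the tenant network should still pass through the FORWARD chain unchanged.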

Comment 2 Alex Stupnikov 2019-04-15 15:41:41 UTC
Docker images:

  DockerNeutronApiImage: 192.168.24.1:8787/rhosp13/openstack-neutron-server:2019-02-24.1
  DockerNeutronConfigImage: 192.168.24.1:8787/rhosp13/openstack-neutron-server:2019-02-24.1
  DockerNeutronDHCPImage: 192.168.24.1:8787/rhosp13/openstack-neutron-dhcp-agent:2019-02-24.1
  DockerNeutronL3AgentImage: 192.168.24.1:8787/rhosp13/openstack-neutron-l3-agent:2019-02-24.1
  DockerNeutronMetadataImage: 192.168.24.1:8787/rhosp13/openstack-neutron-metadata-agent:2019-02-24.1
  DockerOpenvswitchImage: 192.168.24.1:8787/rhosp13/openstack-neutron-openvswitch-agent:2019-02-24.1

  DockerNovaApiImage: 192.168.24.1:8787/rhosp13/openstack-nova-api:2019-02-24.1
  DockerNovaComputeImage: 192.168.24.1:8787/rhosp13/openstack-nova-compute:2019-02-24.1
  DockerNovaConductorImage: 192.168.24.1:8787/rhosp13/openstack-nova-conductor:2019-02-24.1
  DockerNovaConfigImage: 192.168.24.1:8787/rhosp13/openstack-nova-api:2019-02-24.1
  DockerNovaConsoleauthImage: 192.168.24.1:8787/rhosp13/openstack-nova-consoleauth:2019-02-24.1
  DockerNovaLibvirtConfigImage: 192.168.24.1:8787/rhosp13/openstack-nova-compute:2019-02-24.1
  DockerNovaLibvirtImage: 192.168.24.1:8787/rhosp13/openstack-nova-libvirt:2019-02-24.1
  DockerNovaMetadataImage: 192.168.24.1:8787/rhosp13/openstack-nova-api:2019-02-24.1
  DockerNovaPlacementConfigImage: 192.168.24.1:8787/rhosp13/openstack-nova-placement-api:2019-02-24.1
  DockerNovaPlacementImage: 192.168.24.1:8787/rhosp13/openstack-nova-placement-api:2019-02-24.1
  DockerNovaSchedulerImage: 192.168.24.1:8787/rhosp13/openstack-nova-scheduler:2019-02-24.1
  DockerNovaVncProxyImage: 192.168.24.1:8787/rhosp13/openstack-nova-novncproxy:2019-02-24.1

Comment 27 errata-xmlrpc 2019-09-03 16:58:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2623

