Bug 1758688
Summary: | [OSP16] neutron-dhcp fail to spawn DHCP process for networks on the undercloud with Spine-Leaf network topology | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Yuri Obshansky <yobshans> |
Component: | openstack-neutron | Assignee: | Rodolfo Alonso <ralonsoh> |
Status: | CLOSED ERRATA | QA Contact: | Eran Kuris <ekuris> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 16.0 (Train) | CC: | amuller, bdobreli, bfournie, ccamposr, chrisw, dasmith, dsneddon, eglynn, hjensas, jhakimra, jlibosva, jschluet, kchamart, ralonsoh, sbauza, scohen, sgordon, vromanso, yobshans |
Target Milestone: | beta | Keywords: | Triaged |
Target Release: | 16.0 (Train on RHEL 8.1) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-neutron-15.0.1-0.20191126044254.d18d02e.el8ost | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-02-06 14:42:27 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Yuri Obshansky
2019-10-04 21:00:44 UTC
When looking at the new reproducer I noticed that the remote compute nodes did not boot the provisioning ramdisk. No DHCP offer was received. In the messages log I saw:

Oct 8 17:01:05 site-undercloud-0 dnsmasq-dhcp[64563]: no address range available for DHCP request via 192.168.44.254
Oct 8 17:01:10 site-undercloud-0 dnsmasq-dhcp[64563]: no address range available for DHCP request via 192.168.34.254

After restarting the neutron dnsmasq service and resetting the baremetal nodes, they booted the ramdisk and the deployment continued.

Ports are created on the isolated networks:

(undercloud) [stack@site-undercloud-0 ~]$ openstack port list | egrep 'overcloud-compute2|overcloud-compute1'
| 2d5f8c07-0fc2-40dc-957a-721a1e912def | overcloud-compute1-0_Storage | fa:16:3e:66:df:3e | ip_address='172.23.2.69', subnet_id='b6dc8a40-687c-4ab6-be79-e1a431e1c803' | DOWN |
| 2dcac122-e156-43d1-b796-c826125c91cc | overcloud-compute1-0_Tenant | fa:16:3e:4d:16:68 | ip_address='172.19.2.118', subnet_id='0ed7da69-6746-4a5b-b6f0-1c3366959360' | DOWN |
| 65375d96-41be-4fb2-998b-90371534f2e6 | overcloud-compute1-0_InternalApi | fa:16:3e:fd:11:d7 | ip_address='172.25.2.19', subnet_id='543eb45f-ea9f-4444-ac8f-5064d83a30bf' | DOWN |
| 81dd2498-b808-411c-9498-1b3b27663658 | overcloud-compute2-0_Tenant | fa:16:3e:26:47:9d | ip_address='172.19.3.16', subnet_id='915e45f9-d0f4-428b-926e-dc9da880c5c4' | DOWN |
| a06256e2-8c0b-4da1-8961-ffd6ca996498 | overcloud-compute2-0_Storage | fa:16:3e:73:ea:09 | ip_address='172.23.3.246', subnet_id='06da4816-81c1-49b8-b7b9-6b82a25acc98' | DOWN |
| f745d08a-df2c-4d94-9685-ab144ab78e50 | overcloud-compute2-0_InternalApi | fa:16:3e:46:9a:f1 | ip_address='172.25.3.176', subnet_id='8e974b8e-e0c7-4a8b-8392-b819d764bd9f' | DOWN |

This could be a sync issue in the neutron DHCP agent. Once the deployment is done we plan to add another subnet on the ctlplane network and check the DHCP agent logs and the dnsmasq config to see if it syncs as expected.
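The dnsmasq message quoted above generally indicates that a relayed request arrived via an address that falls in none of the subnets dnsmasq was started with. A minimal Python sketch of that check follows. The helper names `parse_dhcp_ranges` and `relay_is_served` are hypothetical, and the command-line fragments are shortened from the ones in this report, assuming the `set:tagN,<network>,static,<netmask>,<lease>` layout seen there.

```python
import ipaddress

# Shortened dnsmasq argument strings taken from this report (assumption:
# only the --dhcp-range arguments matter for the check).
ONLY_LOCAL = "--dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s"
ALL_LEAVES = (
    ONLY_LOCAL
    + " --dhcp-range=set:tag1,192.168.44.0,static,255.255.255.0,86400s"
    + " --dhcp-range=set:tag2,192.168.34.0,static,255.255.255.0,86400s"
)


def parse_dhcp_ranges(cmdline):
    """Return the subnets named by the --dhcp-range arguments of a command line."""
    subnets = []
    for arg in cmdline.split():
        if not arg.startswith("--dhcp-range="):
            continue
        # e.g. ['set:tag0', '192.168.24.0', 'static', '255.255.255.0', '86400s']
        fields = arg.split("=", 1)[1].split(",")
        network, netmask = fields[1], fields[3]
        subnets.append(ipaddress.ip_network(f"{network}/{netmask}"))
    return subnets


def relay_is_served(relay_ip, subnets):
    """True if the relay address belongs to at least one configured subnet."""
    address = ipaddress.ip_address(relay_ip)
    return any(address in subnet for subnet in subnets)


if __name__ == "__main__":
    for relay in ("192.168.44.254", "192.168.34.254"):
        for label, cmdline in (("only local", ONLY_LOCAL), ("all leaves", ALL_LEAVES)):
            served = relay_is_served(relay, parse_dhcp_ranges(cmdline))
            print(f"{relay} with {label} ranges: served={served}")
```

Running the same parsing against the real `ps aux` output is one way to confirm whether the leaf subnets reached dnsmasq before resetting the baremetal nodes.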
[root@site-undercloud-0 containers]# ps aux | grep dnsmasq | grep dumb-init | grep neutron | grep 'dhcp-range'
root 370937 0.0 0.0 4204 796 ? Ss 17:07 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/host --addn-hosts=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/opts --dhcp-leasefile=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-range=set:tag1,192.168.44.0,static,255.255.255.0,86400s --dhcp-range=set:tag2,192.168.34.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=768 --conf-file= --domain=localdomain

The three ranges above:

--dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s
--dhcp-range=set:tag1,192.168.44.0,static,255.255.255.0,86400s
--dhcp-range=set:tag2,192.168.34.0,static,255.255.255.0,86400s

(undercloud) [stack@site-undercloud-0 ~]$ openstack network segment create test-segment \
  --network ctlplane \
  --network-type flat \
  --physical-network test-network
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| description | |
| id | 9830d2f7-efec-4616-ac87-46fdc98bf6f9 |
| location | cloud='', project.domain_id=, project.domain_name='Default', project.id='72ca0a268d294bd0ac01fc30e3e0dc33', project.name='admin', region_name='', zone= |
| name | test-segment |
| network_id | e25eae7d-1c1c-4017-b66b-899fb17f025e |
| network_type | flat |
| physical_network | test-network |
| segmentation_id | None |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+

(undercloud) [stack@site-undercloud-0 ~]$ openstack subnet create \
  --network ctlplane \
  --network-segment 9830d2f7-efec-4616-ac87-46fdc98bf6f9 \
  --subnet-range 192.168.54.0/24 \
  --dhcp \
  --gateway 192.168.54.254 \
  --allocation-pool start=192.168.54.10,end=192.168.54.100 \
  --dns-nameserver 192.168.54.254 test-subnet
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| allocation_pools | 192.168.54.10-192.168.54.100 |
| cidr | 192.168.54.0/24 |
| created_at | 2019-10-08T18:12:12Z |
| description | |
| dns_nameservers | 192.168.54.254 |
| enable_dhcp | True |
| gateway_ip | 192.168.54.254 |
| host_routes | destination='192.168.24.0/24', gateway='192.168.54.254' |
| | destination='192.168.34.0/24', gateway='192.168.54.254' |
| | destination='192.168.44.0/24', gateway='192.168.54.254' |
| id | 21f8b34c-2e8c-4c73-acbb-7acaa012e0b3 |
| ip_version | 4 |
| ipv6_address_mode | None |
| ipv6_ra_mode | None |
| location | cloud='', project.domain_id=, project.domain_name='Default', project.id='72ca0a268d294bd0ac01fc30e3e0dc33', project.name='admin', region_name='', zone= |
| name | test-subnet |
| network_id | e25eae7d-1c1c-4017-b66b-899fb17f025e |
| prefix_length | None |
| project_id | 72ca0a268d294bd0ac01fc30e3e0dc33 |
| revision_number | 0 |
| segment_id | 9830d2f7-efec-4616-ac87-46fdc98bf6f9 |
| service_types | |
| subnetpool_id | None |
| tags | |
| updated_at | 2019-10-08T18:12:12Z |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+

[root@site-undercloud-0 containers]# ps aux | grep dnsmasq | grep dumb-init | grep neutron | grep 'dhcp-range'
root 477344 0.1 0.0 4204 788 ? Ss 18:12 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/host --addn-hosts=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/opts --dhcp-leasefile=/var/lib/neutron/dhcp/e25eae7d-1c1c-4017-b66b-899fb17f025e/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-range=set:tag1,192.168.54.0,static,255.255.255.0,86400s --dhcp-range=set:tag2,192.168.44.0,static,255.255.255.0,86400s --dhcp-range=set:tag3,192.168.34.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=1024 --conf-file= --domain=localdomain

The agent syncs as expected:

--dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s
--dhcp-range=set:tag1,192.168.54.0,static,255.255.255.0,86400s <-- New subnet
--dhcp-range=set:tag2,192.168.44.0,static,255.255.255.0,86400s
--dhcp-range=set:tag3,192.168.34.0,static,255.255.255.0,86400s

Lots of these in the DHCP agent log:

2019-10-08 18:12:15.517 370514 ERROR neutron.agent.dhcp.agent error creating container storage: the container name "neutron-dnsmasq-qdhcp-e25eae7d-1c1c-4017-b66b-899fb17f025e" is already in use by "37c8238f718a05caf4b70e42b8f9b1b6c28cb9dcff7f97a40a733bd949855d02".
You have to remove that container to be able to reuse that name.: that name is already in use

After a redeploy of the undercloud, the ranges look fine.

[stack@site-undercloud-0 ~]$ sudo ps aux | grep dnsmasq | grep dumb-init | grep neutron | grep 'dhcp-range'
root 65071 0.0 0.0 4204 800 ? Ss 19:56 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/aa7bfe58-55f1-41e8-99f2-198bc7a8feab/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/aa7bfe58-55f1-41e8-99f2-198bc7a8feab/host --addn-hosts=/var/lib/neutron/dhcp/aa7bfe58-55f1-41e8-99f2-198bc7a8feab/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/aa7bfe58-55f1-41e8-99f2-198bc7a8feab/opts --dhcp-leasefile=/var/lib/neutron/dhcp/aa7bfe58-55f1-41e8-99f2-198bc7a8feab/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-range=set:tag1,192.168.44.0,static,255.255.255.0,86400s --dhcp-range=set:tag2,192.168.34.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=768 --conf-file= --domain=localdomain

--dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s
--dhcp-range=set:tag1,192.168.44.0,static,255.255.255.0,86400s
--dhcp-range=set:tag2,192.168.34.0,static,255.255.255.0,86400s

It is still reproducible with RHOS_TRUNK-16.0-RHEL-8-20191007.n.0:

dhcp-libs-4.3.6-34.el8.x86_64
dhcp-client-4.3.6-34.el8.x86_64
dhcp-common-4.3.6-34.el8.noarch
python3-neutronclient-6.14.0-0.20190919181709.115f60f.el8ost.noarch
puppet-neutron-15.4.1-0.20191004213143.f76f779.el8ost.noarch
python3-neutron-lib-1.29.1-0.20190923154030.4ef4b71.el8ost.noarch

And it is definitely related to the Spine-Leaf network topology. Right after a fresh undercloud installation we have:

(undercloud) [stack@site-undercloud-0 neutron]$ ps aux | grep dnsmasq | grep dumb-init | grep neutron | grep 'dhcp-range'
root 66559 0.0 0.0 4204 756 ? Ss 21:17 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/host --addn-hosts=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/opts --dhcp-leasefile=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=256 --conf-file= --domain=localdomain

So the dhcp-range entries for 192.168.34.0 and 192.168.44.0 are missing.

Found in dhcp-agent.log:

2019-10-10 21:17:43.623 66326 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[140001026244384]: Network interface tapeb1409d5-77 not found in namespace qdhcp-a3118545-bfee-4b4a-8064-1eb965be7124.
_process_cmd /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:454
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 246, in get_link_id
    return ip.link_lookup(ifname=device)[0]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 449, in _process_cmd
    ret = func(*f_args, **f_kwargs)
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 53, in sync_inner
    return input_func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 247, in _wrap
    return func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 406, in get_link_attributes
    link = _run_iproute_link("get", device, namespace)[0]
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 254, in _run_iproute_link
    idx = get_link_id(device, namespace)
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 248, in get_link_id
    raise NetworkInterfaceNotFound(device=device, namespace=namespace)
neutron.privileged.agent.linux.ip_lib.NetworkInterfaceNotFound: Network interface tapeb1409d5-77 not found in namespace qdhcp-a3118545-bfee-4b4a-8064-1eb965be7124.
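The chained exceptions in the traceback above follow a common pattern: a lookup returns an empty list, indexing it raises IndexError, and the handler raises a more descriptive error. The Python sketch below imitates that pattern only. The `link_lookup` helper and the dictionary of links are hypothetical stand-ins, not neutron's or pyroute2's code, and the namespace contents are invented for illustration.

```python
class NetworkInterfaceNotFound(Exception):
    """Illustrative counterpart of the exception named in the traceback."""

    def __init__(self, device, namespace):
        super().__init__(
            f"Network interface {device} not found in namespace {namespace}."
        )


def link_lookup(links, ifname):
    """Hypothetical lookup: return the indexes of links whose name matches."""
    return [index for index, name in links.items() if name == ifname]


def get_link_id(links, device, namespace):
    """Return the index of the device, or raise if the namespace lacks it."""
    try:
        # An empty result makes [0] raise IndexError, as in the first traceback.
        return link_lookup(links, device)[0]
    except IndexError:
        # Raising inside the handler produces the "During handling of the
        # above exception, another exception occurred" chaining.
        raise NetworkInterfaceNotFound(device=device, namespace=namespace)


if __name__ == "__main__":
    # Assumed state: the namespace holds only the loopback device.
    namespace_links = {1: "lo"}
    try:
        get_link_id(namespace_links, "tapeb1409d5-77", "qdhcp-example")
    except NetworkInterfaceNotFound as exc:
        print(exc)
```

The lookup is scoped to one namespace, so a device that is present elsewhere on the host would still produce this error if it is not visible inside that namespace.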
2019-10-10 21:17:43.624 66326 DEBUG oslo.privsep.daemon [-] privsep: reply[140001026244384]: (5, 'neutron.privileged.agent.linux.ip_lib.NetworkInterfaceNotFound', ('Network interface tapeb1409d5-77 not found in namespace qdhcp-a3118545-bfee-4b4a-8064-1eb965be7124.',)) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:475

Yet the interface tapeb1409d5-77 does exist:

(undercloud) [stack@site-undercloud-0 neutron]$ sudo ovs-vsctl list-ports br-int
int-br-ctlplane
tapeb1409d5-77

This is probably the issue tracked in https://bugs.launchpad.net/tripleo/+bug/1799484

Most likely this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1732980 per the missing dhcp-range settings.

As a workaround, after the undercloud (UC) installation we can run:

$ sudo setenforce permissive
$ sudo podman restart neutron_dhcp

(undercloud) [stack@site-undercloud-0 ~]$ ps aux | grep dnsmasq | grep dumb-init | grep neutron | grep 'dhcp-range'
root 555146 0.0 0.0 4204 788 ? Ss 13:26 0:00 dumb-init --single-child -- /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/host --addn-hosts=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/opts --dhcp-leasefile=/var/lib/neutron/dhcp/a3118545-bfee-4b4a-8064-1eb965be7124/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-range=set:tag1,192.168.44.0,static,255.255.255.0,86400s --dhcp-range=set:tag2,192.168.34.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=768 --conf-file= --domain=localdomain

(undercloud) [stack@site-undercloud-0 ~]$ openstack port list |grep compute1
| 9fa796ef-ff58-4612-a241-76f8bcca8016 | overcloud-compute1-0_Tenant | fa:16:3e:0b:be:d4 | ip_address='172.19.2.116', subnet_id='894660e7-5615-4014-8f3f-6490f3e4030b' | DOWN |
| bbed1e4b-dc77-4c7e-8f87-3a17f3ba7115 | overcloud-compute1-0_Storage | fa:16:3e:99:d3:4a | ip_address='172.23.2.158', subnet_id='2c9aa9fd-938e-4ed6-b50b-c3ea579a37b7' | DOWN |
| eae958e5-6724-430f-9173-36fcd5410fd4 | overcloud-compute1-0_InternalApi | fa:16:3e:c3:9d:27 | ip_address='172.25.2.145', subnet_id='3ebef40a-e0f0-462e-badd-3150acf88e63' | DOWN |

(undercloud) [stack@site-undercloud-0 ~]$ openstack port list |grep compute2
| 857b7842-45dc-4779-a983-3b06b8095299 | overcloud-compute2-0_Tenant | fa:16:3e:f9:31:f0 | ip_address='172.19.3.92', subnet_id='6b3d8e10-1da7-4526-b3ef-c828987bcbb0' | DOWN |
| a3e48c30-6510-4109-9905-e47e995f0386 | overcloud-compute2-0_Storage | fa:16:3e:76:3f:f1 | ip_address='172.23.3.249', subnet_id='7b553b4b-218c-41f1-8378-57f984bea5e1' | DOWN |
| fcbb80c6-51f0-4d4e-8677-ea35f341375c | overcloud-compute2-0_InternalApi | fa:16:3e:0b:39:42 | ip_address='172.25.3.8', subnet_id='ccc4851b-d9bd-432a-bf2f-bb3d738b50cc' | DOWN |

Anyway, this is a bug and has to be fixed.

See also https://bugs.launchpad.net/neutron/+bug/1848738; it seems that multiple bugs are tracking this same issue.

Rodolfo, I was going to make this a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1732980 unless you want to keep this one open.

Removing HardProv as this looks like a neutron-related issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283