Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2273948

Summary: [16.2] Ephemeral heat fails resolving rabbitmq without using fqdn
Product: Red Hat OpenStack
Reporter: Martin Schuppert <mschuppe>
Component: osp-director-operator-container
Assignee: OSP Team <rhos-maint>
Status: CLOSED DUPLICATE
QA Contact:
Severity: high
Docs Contact:
Priority: high
Version: 17.1 (Wallaby)
CC: rhos-maint
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2273947
Environment:
Last Closed: 2024-04-08 10:26:53 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2273947
Bug Blocks:

Description Martin Schuppert 2024-04-08 10:21:27 UTC
+++ This bug was initially created as a clone of Bug #2273947 +++

Description of problem:

When ephemeral heat receives a DNS REFUSED response, it stops resolving and does not go on to try the search domains.
One workaround is to use the fully qualified DNS name of the service, rabbitmq-default.openstack.svc.cluster.local., in the messaging transport_url (that is, append .cluster.local so the name is fully qualified and resolution does not depend on the recursive resolver returning NXDOMAIN).
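
A minimal sketch of that workaround, assuming the transport_url has a single `user:pass@host:port` authority and that the cluster domain is `cluster.local`. The function name `qualify_transport_url`, the credentials, and the `?ssl=0` query are hypothetical and only illustrate the rewrite; this is not the operator's actual fix.

```python
from urllib.parse import urlsplit, urlunsplit


def qualify_transport_url(url, cluster_domain="cluster.local"):
    """Return url with a '<name>.svc' host rewritten to '<name>.svc.<cluster_domain>.'.

    Hypothetical helper: handles one host only (no comma-separated host
    lists, no IPv6 literals). Hosts that do not end in '.svc' are left alone.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host is None or not host.endswith(".svc"):
        return url

    # The trailing dot marks the name as absolute, so no search list is applied.
    fqdn = f"{host}.{cluster_domain}."

    # Rebuild the authority while keeping credentials and port untouched.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    _, colon, port = hostport.partition(":")
    netloc = f"{userinfo}{at}{fqdn}{colon}{port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, `rabbit://guest:secret@rabbitmq-default.openstack.svc:5672/?ssl=0` would become `rabbit://guest:secret@rabbitmq-default.openstack.svc.cluster.local.:5672/?ssl=0`.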

Version-Release number of selected component (if applicable):
17.1

--- Additional comment from Martin Schuppert on 2024-04-08 10:17:24 UTC ---

Some more details:

Logs from heat when the issue was seen:
~~~
2024-03-19 12:05:15.963 98 ERROR oslo.messaging._drivers.impl_rabbit [req-dc61c685-e410-4453-8ae9-639605d7882c admin admin - - -] Connection failed: [Errno -2] Name or service not known (retrying in 1.0 seconds): socket.gaierror: [Errno -2] Name or service not known
2024-03-19 12:05:16.977 98 ERROR oslo.messaging._drivers.impl_rabbit [req-dc61c685-e410-4453-8ae9-639605d7882c admin admin - - -] Connection failed: [Errno -2] Name or service not known (retrying in 3.0 seconds): socket.gaierror: [Errno -2] Name or service not known
~~~

Using tcpdump, we see:
~~~
20:47:29.197760 IP 10.131.1.95.41893 > 172.30.0.10.53: 3477+ AAAA? rabbitmq-default.openstack.svc. (48)
20:47:29.202793 IP 172.30.0.10.53 > 10.131.1.95.41893: 3477 Refused- 0/0/0 (48)
20:47:29.203645 IP 10.131.1.95.54084 > 172.30.0.10.53: 44480+ A? rabbitmq-default.openstack.svc. (48)
20:47:29.204775 IP 172.30.0.10.53 > 10.131.1.95.54084: 44480 Refused- 0/0/0 (48)
20:47:32.720715 IP 10.131.1.95.52489 > 172.30.0.10.53: 41196+ AAAA? rabbitmq-default.openstack.svc. (48)
20:47:32.721715 IP 172.30.0.10.53 > 10.131.1.95.52489: 41196 Refused- 0/0/0 (48)
20:47:32.722360 IP 10.131.1.95.50224 > 172.30.0.10.53: 33821+ A? rabbitmq-default.openstack.svc. (48)
20:47:32.723457 IP 172.30.0.10.53 > 10.131.1.95.50224: 33821 Refused- 0/0/0 (48)
~~~
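
The capture is consistent with a lookup that queries the name as given, receives REFUSED, and gives up without trying the search list. The toy model below only illustrates that reported behaviour. It is not the code of glibc or of any real resolver, and the search list and the `answers` table are assumptions (the address is the one curl connects to in this report).

```python
REFUSED = "REFUSED"
NXDOMAIN = "NXDOMAIN"


def candidates(name, search):
    """Names to query, in order: the name as given, then name + each search domain."""
    if name.endswith("."):
        return [name]  # already absolute, the search list is not applied
    return [name + "."] + [f"{name}.{domain}." for domain in search]


def resolve(name, search, answers, stop_on_refused=True):
    """Return (address or None, list of names queried) against a fake answer table."""
    tried = []
    for qname in candidates(name, search):
        tried.append(qname)
        rcode = answers.get(qname, NXDOMAIN)
        if rcode not in (REFUSED, NXDOMAIN):
            return rcode, tried  # the table entry is an address
        if rcode == REFUSED and stop_on_refused:
            break  # the behaviour described in this bug
    return None, tried


# Assumed pod search list and fake server answers, for illustration only.
SEARCH = ["openstack.svc.cluster.local", "svc.cluster.local", "cluster.local"]
ANSWERS = {
    "rabbitmq-default.openstack.svc.": REFUSED,
    "rabbitmq-default.openstack.svc.cluster.local.": "172.30.237.99",
}
```

In this model, `resolve("rabbitmq-default.openstack.svc", SEARCH, ANSWERS)` fails after a single query, while the same call with `stop_on_refused=False`, or a call with the fully qualified name, reaches the address.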


/etc/resolv.conf is correctly configured with all search domains. Manual testing also works:
~~~
sh-5.1$ curl -vv http://rabbitmq-default.openstack.svc:5672/
*   Trying 172.30.237.99:5672...
* Connected to rabbitmq-default.openstack.svc (172.30.237.99) port 5672 (#0)
> GET / HTTP/1.1
> Host: rabbitmq-default.openstack.svc:5672
> User-Agent: curl/7.76.1
> Accept: */*
>
* Received HTTP/0.9 when not allowed
* Closing connection 0
curl: (1) Received HTTP/0.9 when not allowed
~~~
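
Because heat reports `socket.gaierror`, the same manual check can also be made from Python. This is a sketch using the standard library `socket.getaddrinfo`; the helper name `try_resolve` is hypothetical, and the lookup path it exercises may differ from the one used inside the heat process, so a successful result here does not rule out the failure shown in the logs.

```python
import socket


def try_resolve(host, port):
    """Return the sorted addresses for host, or a 'gaierror: ...' string on failure."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"gaierror: {exc}"
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `try_resolve("rabbitmq-default.openstack.svc", 5672)` and `try_resolve("rabbitmq-default.openstack.svc.cluster.local.", 5672)` from the affected pod allows the short and fully qualified names to be compared side by side.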

Comment 1 Martin Schuppert 2024-04-08 10:26:53 UTC

*** This bug has been marked as a duplicate of bug 2273950 ***