Description of problem:
I'm simulating an unreachable DNS server in an IPv6 deployment (using IPv4 DNS servers). When the first of the DNS servers in resolv.conf is unreachable and queries to it time out, the cluster breaks: the rabbitmq resources fail and all the OpenStack resources end up Stopped.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-2.0.0-16.el7ost.noarch
rabbitmq-server-3.6.3-5.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy an IPv6 overcloud with IPv4 DNS servers:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates \
-e $THT/environments/network-isolation-v6.yaml \
-e ~/templates/network-environment-v6.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
--control-scale 3 \
--control-flavor controller \
--compute-scale 1 \
--compute-flavor compute \
--ceph-storage-scale 1 \
--ceph-storage-flavor ceph \
--ntp-server clock.redhat.com \
--libvirt-type qemu

2. Make sure the overcloud gets deployed successfully and all pcs resources are started.

3. Simulate failure of the first DNS server set in the resolv.conf of the overcloud nodes:

[root@overcloud-controller-2 heat-admin]# cat /etc/resolv.conf
# Generated by NetworkManager
search localdomain
# No nameservers found; try putting DNS servers into your
# ifcfg files in /etc/sysconfig/network-scripts like so:
#
# DNS1=xxx.xxx.xxx.xxx
# DNS2=xxx.xxx.xxx.xxx
# DOMAIN=lab.foo.com bar.foo.com
nameserver 10.16.36.29
nameserver 10.11.5.19

On the virthost, block access to the first DNS server:
ip route add blackhole 10.16.36.29/32

Actual results:
The cluster breaks and the OpenStack services aren't accessible.

[stack@undercloud ~]$ source overcloudrc
[stack@undercloud ~]$ nova list
No handlers could be found for logger "keystoneauth.identity.generic.base"
ERROR (ServiceUnavailable): Service Unavailable (HTTP 503)

Expected results:
The services remain accessible; a single unreachable DNS server shouldn't cause a service outage.

Additional info:
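For triage, a minimal sketch of how the resolver fallback delay can be measured on a controller once the first nameserver is blackholed. This was not run as part of this report; it assumes dig (bind-utils) is installed and reuses the hostname and resolv.conf from this environment:

# Measure the glibc resolver path that rabbitmqctl and most services go through:
time getent hosts overcloud-controller-0.localdomain

# Query each nameserver from resolv.conf directly with a short timeout to
# see which one is actually answering:
for ns in $(awk '/^nameserver/ {print $2}' /etc/resolv.conf); do
    echo "== $ns =="
    dig +time=2 +tries=1 +short @"$ns" overcloud-controller-0.localdomain A \
        || echo "no answer from $ns"
done

The pcs status output below shows the state the cluster ends up in after the first DNS server becomes unreachable: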
[root@overcloud-controller-2 heat-admin]# pcs status
Cluster name: tripleo_cluster
Last updated: Tue Jul 26 15:49:43 2016          Last change: Tue Jul 26 14:22:14 2016 by hacluster via crmd on overcloud-controller-2
Stack: corosync
Current DC: overcloud-controller-0 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
3 nodes and 127 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Full list of resources:

 ip-fd00.fd00.fd00.4000..10     (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-2001.db8.ca2.4..10          (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-192.168.0.21                (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 Master/Slave Set: galera-master [galera]
     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: rabbitmq-clone [rabbitmq]
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      FAILED overcloud-controller-1 (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      FAILED overcloud-controller-2 (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      FAILED overcloud-controller-0 (unmanaged)
 Clone Set: openstack-core-clone [openstack-core]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-1 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-2 ]
 ip-fd00.fd00.fd00.3000..10     (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 ip-fd00.fd00.fd00.2000..10     (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000..11     (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 Clone Set: mongod-clone [mongod]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Stopped
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Failed Actions:
* rabbitmq_stop_0 on overcloud-controller-1 'unknown error' (1): call=692, status=Timed Out, exitreason='none',
    last-rc-change='Tue Jul 26 14:48:57 2016', queued=0ms, exec=90009ms
* rabbitmq_stop_0 on overcloud-controller-2 'unknown error' (1): call=637, status=Timed Out, exitreason='none',
    last-rc-change='Tue Jul 26 14:47:27 2016', queued=0ms, exec=90001ms
* rabbitmq_stop_0 on overcloud-controller-0 'unknown error' (1): call=705, status=Timed Out, exitreason='none',
    last-rc-change='Tue Jul 26 14:45:56 2016', queued=0ms, exec=90008ms


PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
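Once DNS is reachable again (or the /etc/hosts workaround discussed later in this bug is in place), the failed and unmanaged rabbitmq clone can usually be recovered with standard pcs commands. This is a hedged sketch, not steps that were run for this report:

pcs resource show rabbitmq-clone        # inspect the configured operations/timeouts
pcs resource cleanup rabbitmq           # clear the failed stop actions
pcs resource manage rabbitmq-clone      # re-enable management if it is still unmanaged
pcs status | grep -A3 'rabbitmq-clone'  # confirm the clone returns to Started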
Created attachment 1184604 [details] /var/log/rabbitmq/ on controller-0
This isn't something related to IPv4 or IPv6. An inaccessible DNS server causes timeouts in the rabbitmqctl script, so pcs believes that the resource has failed.

Also this affects connectivity between RabbitMQ nodes somehow:

=INFO REPORT==== 27-Jul-2016::11:47:29 ===
rabbit on node 'rabbit@overcloud-controller-1' down

=INFO REPORT==== 27-Jul-2016::11:47:29 ===
Keep rabbit@overcloud-controller-1 listeners: the node is already back

=INFO REPORT==== 27-Jul-2016::11:47:29 ===
Mirrored queue 'q-reports-plugin_fanout_1dd728a4a40346a685366957e5ce83ad' in vhost '/': Master <rabbit.8780.0> saw deaths of mirrors <rabbit.9482.0>

=INFO REPORT==== 27-Jul-2016::11:47:29 ===
Mirrored queue 'conductor_fanout_4b9e5610aa3a4a6985b436780322df77' in vhost '/': Slave <rabbit.15504.0> saw deaths of mirrors <rabbit.17655.0>

...

=INFO REPORT==== 27-Jul-2016::11:48:44 ===
node 'rabbit@overcloud-controller-2' down: connection_closed

=WARNING REPORT==== 27-Jul-2016::11:48:44 ===
Cluster minority/secondary status detected - awaiting recovery
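A quick way to check whether name resolution is what pushes rabbitmqctl past the resource agent's timeouts is to time it against a plain glibc lookup. An illustrative probe using a controller hostname from this setup, not something taken from the attached logs:

time rabbitmqctl status > /dev/null                    # slow here points at the Erlang/DNS path
time rabbitmqctl cluster_status > /dev/null
time getent hosts overcloud-controller-1.localdomain   # compare with a plain glibc lookup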
(In reply to Peter Lemenkov from comment #5)
> This isn't something related to IPv4 or IPv6. An inaccessible DNS server
> causes timeouts in the rabbitmqctl script, so pcs believes that the
> resource has failed.

I tried the same scenario on an IPv4 environment and couldn't reproduce this issue, hence the report is only for IPv6 deployments.
So first of all, I'd like to understand the meaning of this test.

What are we testing exactly? What are we trying to validate?

The test, as I see it, is checking whether the glibc resolver is fast enough in recognizing that a DNS server is faulty and moving on to the next one.

It's entirely possible that what you see here is simply the resolver returning a timeout to rabbitmq (that could be racy btw and explain why you might see it in IPv6 and not in IPv4) and rabbitmq is correctly reporting the error.

The only course of action here is to fix the infrastructure.

I consider rabbitmq's behavior of returning the error and failing, rather than trying to recover, to be correct.
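As a side note on the resolver itself: glibc's per-nameserver timeout and retry behaviour can be tightened in resolv.conf (see resolv.conf(5)). This is a generic sketch rather than something tested in this deployment, and NetworkManager/os-net-config may rewrite the file on overcloud nodes:

# Example /etc/resolv.conf with tightened resolver options: 1s timeout per
# nameserver, two attempts, and rotation across the configured servers.
#
#   options timeout:1 attempts:2 rotate
#   search localdomain
#   nameserver 10.16.36.29
#   nameserver 10.11.5.19

# With timeout:1 the fallback to the second nameserver should cost roughly
# one second per query instead of the default five:
time getent hosts overcloud-controller-0.localdomain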
(In reply to Fabio Massimo Di Nitto from comment #8)
> So first of all, I'd like to understand the meaning of this test.
>
> What are we testing exactly? What are we trying to validate?

The test is about validating that the cluster remains functional when one of the DNS servers becomes unreachable.

> The test, as I see it, is checking whether the glibc resolver is fast
> enough in recognizing that a DNS server is faulty and moving on to the
> next one.
>
> It's entirely possible that what you see here is simply the resolver
> returning a timeout to rabbitmq (that could be racy btw and explain why
> you might see it in IPv6 and not in IPv4) and rabbitmq is correctly
> reporting the error.
>
> The only course of action here is to fix the infrastructure.

I'm not sure about this; the point of using multiple DNS servers is to automatically cover this kind of situation when one becomes unavailable. Otherwise the connection to one DNS server becomes a single point of failure for the cluster.

Say one needs to take one DNS server down for maintenance; should one then also expect the OpenStack services to become unavailable? This doesn't sound acceptable to me.

> I consider rabbitmq's behavior of returning the error and failing, rather
> than trying to recover, to be correct.
(In reply to Marius Cornea from comment #9)
> (In reply to Fabio Massimo Di Nitto from comment #8)
> > So first of all, I'd like to understand the meaning of this test.
> >
> > What are we testing exactly? What are we trying to validate?
>
> The test is about validating that the cluster remains functional when one
> of the DNS servers becomes unreachable.

Ok, then I'd like to see the tcpdumps between the hosts and the second DNS server while this test is running, to isolate the issue between rabbitmq <-> glibc resolver <-> functional DNS.

It's entirely possible that due to caching the second DNS server failed to respond in time and a timeout was triggered by the resolver to rabbitmq.

> > The test, as I see it, is checking whether the glibc resolver is fast
> > enough in recognizing that a DNS server is faulty and moving on to the
> > next one.
> >
> > It's entirely possible that what you see here is simply the resolver
> > returning a timeout to rabbitmq (that could be racy btw and explain why
> > you might see it in IPv6 and not in IPv4) and rabbitmq is correctly
> > reporting the error.
> >
> > The only course of action here is to fix the infrastructure.
>
> I'm not sure about this; the point of using multiple DNS servers is to
> automatically cover this kind of situation when one becomes unavailable.
> Otherwise the connection to one DNS server becomes a single point of
> failure for the cluster.

I agree, but see above. You might be hitting a double failure.

> Say one needs to take one DNS server down for maintenance; should one then
> also expect the OpenStack services to become unavailable? This doesn't
> sound acceptable to me.

I agree it should work, but right now the test, as described here, doesn't differentiate between the OpenStack services failing, the glibc resolver taking too long to respond, or infrastructure issues (the second DNS server doesn't reply and hence you hit a timeout).

So if we really want to make sure that the test is testing OSP, you want to collect data from the lower levels to make sure they are all behaving properly when there is a failure at the higher levels, and be able to differentiate.

Make sense?

> > I consider rabbitmq's behavior of returning the error and failing, rather
> > than trying to recover, to be correct.
(In reply to Peter Lemenkov from comment #5)
> This isn't something related to IPv4 or IPv6. An inaccessible DNS server
> causes timeouts in the rabbitmqctl script, so pcs believes that the
> resource has failed.
>
> Also this affects connectivity between RabbitMQ nodes somehow:
>
> =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> rabbit on node 'rabbit@overcloud-controller-1' down
>
> =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> Keep rabbit@overcloud-controller-1 listeners: the node is already back
>
> =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> Mirrored queue 'q-reports-plugin_fanout_1dd728a4a40346a685366957e5ce83ad' in
> vhost '/': Master <rabbit.8780.0> saw deaths of mirrors <rabbit.9482.0>
>
> =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> Mirrored queue 'conductor_fanout_4b9e5610aa3a4a6985b436780322df77' in vhost
> '/': Slave <rabbit.15504.0> saw deaths of mirrors <rabbit.17655.0>
>
> ...
>
> =INFO REPORT==== 27-Jul-2016::11:48:44 ===
> node 'rabbit@overcloud-controller-2' down: connection_closed
>
> =WARNING REPORT==== 27-Jul-2016::11:48:44 ===
> Cluster minority/secondary status detected - awaiting recovery

Might this be caused by overly broad iptables rules that, while isolating access to the DNS server, are also blocking the rabbitmq nodes from talking to each other?
> So if we really want to make sure that the test is testing OSP, you want to
> collect data from the lower levels to make sure they are all behaving
> properly when there is a failure at the higher levels, and be able to
> differentiate.
>
> Make sense?

Yes, definitely. I can gather whatever info is needed to figure out what is causing this. I'm going to get the tcpdump info and attach it.
Created attachment 1185257 [details]
tcpdump + rabbitmq logs

Attaching the tcpdump info and the rabbitmq logs from the controllers. The tcpdump filter that I used:

tcpdump -i eth0 ip src or dst 10.11.5.19 -w controller-1.pcap

where 10.11.5.19 is the 2nd DNS server in resolv.conf and eth0 is the interface that provides connectivity to 10.11.5.19.

From what I can see there are lots of A queries sent for overcloud-controller-x.localdomain that fail because 10.11.5.19 doesn't have any records for these names.

In the IPv4 case the controllers don't need to make these queries because the IPv4 addresses for overcloud-controller-x.localdomain are stored in /etc/hosts. In IPv6 deployments /etc/hosts stores IPv6 addresses, hence the A queries are sent to the resolvers.

I'm not sure which service generates these queries; I'm just noting what the difference between IPv4 and IPv6 deployments might be.
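For anyone reading the attached capture, the A/AAAA split can be pulled out with tcpdump alone; an illustrative reading of the controller-1.pcap file captured above:

# Show A and AAAA queries separately:
tcpdump -nn -r controller-1.pcap udp port 53 2>/dev/null | grep ' A? '    | head
tcpdump -nn -r controller-1.pcap udp port 53 2>/dev/null | grep ' AAAA? ' | head

# Rough per-name counts by query type:
tcpdump -nn -r controller-1.pcap udp port 53 2>/dev/null \
    | grep -oE ' (A|AAAA)\? [^ ]+' | sort | uniq -c | sort -rn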
(In reply to Fabio Massimo Di Nitto from comment #11)
> (In reply to Peter Lemenkov from comment #5)
> > This isn't something related to IPv4 or IPv6. An inaccessible DNS server
> > causes timeouts in the rabbitmqctl script, so pcs believes that the
> > resource has failed.
> >
> > Also this affects connectivity between RabbitMQ nodes somehow:
> >
> > =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> > rabbit on node 'rabbit@overcloud-controller-1' down
> >
> > =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> > Keep rabbit@overcloud-controller-1 listeners: the node is already back
> >
> > =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> > Mirrored queue 'q-reports-plugin_fanout_1dd728a4a40346a685366957e5ce83ad'
> > in vhost '/': Master <rabbit.8780.0> saw deaths of mirrors <rabbit.9482.0>
> >
> > =INFO REPORT==== 27-Jul-2016::11:47:29 ===
> > Mirrored queue 'conductor_fanout_4b9e5610aa3a4a6985b436780322df77' in
> > vhost '/': Slave <rabbit.15504.0> saw deaths of mirrors <rabbit.17655.0>
> >
> > ...
> >
> > =INFO REPORT==== 27-Jul-2016::11:48:44 ===
> > node 'rabbit@overcloud-controller-2' down: connection_closed
> >
> > =WARNING REPORT==== 27-Jul-2016::11:48:44 ===
> > Cluster minority/secondary status detected - awaiting recovery
>
> Might this be caused by overly broad iptables rules that, while isolating
> access to the DNS server, are also blocking the rabbitmq nodes from talking
> to each other?

No, I didn't touch the iptables rules. The restriction was done by setting up a null route for the IP address of the DNS server.
(In reply to Marius Cornea from comment #13)
> Created attachment 1185257 [details]
> tcpdump + rabbitmq logs
>
> Attaching the tcpdump info and the rabbitmq logs from the controllers. The
> tcpdump filter that I used:
>
> tcpdump -i eth0 ip src or dst 10.11.5.19 -w controller-1.pcap
>
> where 10.11.5.19 is the 2nd DNS server in resolv.conf and eth0 is the
> interface that provides connectivity to 10.11.5.19.
>
> From what I can see there are lots of A queries sent for
> overcloud-controller-x.localdomain that fail because 10.11.5.19 doesn't
> have any records for these names.

Does the primary DNS server have records for those names? If yes, then the secondary DNS server should have them too, by definition of being a secondary.

> In the IPv4 case the controllers don't need to make these queries because
> the IPv4 addresses for overcloud-controller-x.localdomain are stored in
> /etc/hosts. In IPv6 deployments /etc/hosts stores IPv6 addresses, hence
> the A queries are sent to the resolvers.
>
> I'm not sure which service generates these queries; I'm just noting what
> the difference between IPv4 and IPv6 deployments might be.

I see two potential issues in this deployment.

1) In a dual-stack environment (v4 and v6), resolv.conf should contain entries for both protocols. This should be addressed in tripleo deployments.

2) There is at least one application out there that prefers v4 to v6, which is against the RFC, but it's not completely wrong either in dual-stack environments such as this one.

I think the offending code in rabbit might be here:

rabbit_common/src/rabbit_networking.erl:

gethostaddr(Host, auto) ->
    Lookups = [{Family, inet:getaddr(Host, Family)} || Family <- [inet, inet6]],
    case [{IP, Family} || {Family, {ok, IP}} <- Lookups] of
        []  -> host_lookup_error(Host, Lookups);
        IPs -> IPs
    end;

whereas it should do [inet6, inet] in sequence, but I am no rabbit expert :)
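A simple way to see the asymmetry described above from a controller is to resolve the names one address family at a time through the normal glibc path (files first, then DNS, per nsswitch.conf); illustrative commands, not part of the original analysis:

getent ahostsv6 overcloud-controller-0.localdomain   # satisfied from the IPv6 entry in /etc/hosts
getent ahostsv4 overcloud-controller-0.localdomain   # no local IPv4 entry, so this goes out as a DNS A query
getent hosts    overcloud-controller-0.localdomain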
Also, looking at the pcap file, I can see some IPv6 queries that go unanswered, and that could be the cause of the final failure.
(In reply to Fabio Massimo Di Nitto from comment #15)
> (In reply to Marius Cornea from comment #13)
> > Created attachment 1185257 [details]
> > tcpdump + rabbitmq logs
> >
> > Attaching the tcpdump info and the rabbitmq logs from the controllers. The
> > tcpdump filter that I used:
> >
> > tcpdump -i eth0 ip src or dst 10.11.5.19 -w controller-1.pcap
> >
> > where 10.11.5.19 is the 2nd DNS server in resolv.conf and eth0 is the
> > interface that provides connectivity to 10.11.5.19.
> >
> > From what I can see there are lots of A queries sent for
> > overcloud-controller-x.localdomain that fail because 10.11.5.19 doesn't
> > have any records for these names.
>
> Does the primary DNS server have records for those names? If yes, then the
> secondary DNS server should have them too, by definition of being a
> secondary.

No, the primary DNS server doesn't have records for these names either.

> > In the IPv4 case the controllers don't need to make these queries because
> > the IPv4 addresses for overcloud-controller-x.localdomain are stored in
> > /etc/hosts. In IPv6 deployments /etc/hosts stores IPv6 addresses, hence
> > the A queries are sent to the resolvers.
> >
> > I'm not sure which service generates these queries; I'm just noting what
> > the difference between IPv4 and IPv6 deployments might be.
>
> I see two potential issues in this deployment.
>
> 1) In a dual-stack environment (v4 and v6), resolv.conf should contain
> entries for both protocols. This should be addressed in tripleo deployments.
>
> 2) There is at least one application out there that prefers v4 to v6, which
> is against the RFC, but it's not completely wrong either in dual-stack
> environments such as this one.
>
> I think the offending code in rabbit might be here:
>
> rabbit_common/src/rabbit_networking.erl:
>
> gethostaddr(Host, auto) ->
>     Lookups = [{Family, inet:getaddr(Host, Family)} || Family <- [inet, inet6]],
>     case [{IP, Family} || {Family, {ok, IP}} <- Lookups] of
>         []  -> host_lookup_error(Host, Lookups);
>         IPs -> IPs
>     end;
>
> whereas it should do [inet6, inet] in sequence, but I am no rabbit expert :)
(In reply to Marius Cornea from comment #17)
> (In reply to Fabio Massimo Di Nitto from comment #15)
> > (In reply to Marius Cornea from comment #13)
> > > Created attachment 1185257 [details]
> > > tcpdump + rabbitmq logs
> > >
> > > Attaching the tcpdump info and the rabbitmq logs from the controllers.
> > > The tcpdump filter that I used:
> > >
> > > tcpdump -i eth0 ip src or dst 10.11.5.19 -w controller-1.pcap
> > >
> > > where 10.11.5.19 is the 2nd DNS server in resolv.conf and eth0 is the
> > > interface that provides connectivity to 10.11.5.19.
> > >
> > > From what I can see there are lots of A queries sent for
> > > overcloud-controller-x.localdomain that fail because 10.11.5.19 doesn't
> > > have any records for these names.
> >
> > Does the primary DNS server have records for those names? If yes, then the
> > secondary DNS server should have them too, by definition of being a
> > secondary.
>
> No, the primary DNS server doesn't have records for these names either.

Ok, in this case we should change the bug title entirely. It has nothing to do with DNS being on or off, but rather with how resolv.conf is configured.

Any given application will go out to DNS for lookups unless:
1) the entry is in a local cache (if enabled)
2) it resolves via resolv.conf or other configurable means.

> > > In the IPv4 case the controllers don't need to make these queries
> > > because the IPv4 addresses for overcloud-controller-x.localdomain are
> > > stored in /etc/hosts. In IPv6 deployments /etc/hosts stores IPv6
> > > addresses, hence the A queries are sent to the resolvers.
> > >
> > > I'm not sure which service generates these queries; I'm just noting
> > > what the difference between IPv4 and IPv6 deployments might be.
> >
> > I see two potential issues in this deployment.
> >
> > 1) In a dual-stack environment (v4 and v6), resolv.conf should contain
> > entries for both protocols. This should be addressed in tripleo
> > deployments.

Could you be so kind to retest by manually adding entries for ipv4 (and ipv6) in resolv.conf?

> > 2) There is at least one application out there that prefers v4 to v6,
> > which is against the RFC, but it's not completely wrong either in
> > dual-stack environments such as this one.
> >
> > I think the offending code in rabbit might be here:
> >
> > rabbit_common/src/rabbit_networking.erl:
> >
> > gethostaddr(Host, auto) ->
> >     Lookups = [{Family, inet:getaddr(Host, Family)} || Family <- [inet, inet6]],
> >     case [{IP, Family} || {Family, {ok, IP}} <- Lookups] of
> >         []  -> host_lookup_error(Host, Lookups);
> >         IPs -> IPs
> >     end;
> >
> > whereas it should do [inet6, inet] in sequence, but I am no rabbit expert :)

Just to clarify, the RFC says:

- If IPv6 is configured on the host, then IPv6 should be preferred over IPv4
- If IPv6 can't be used, then fall back to IPv4

Clearly there are many permutations of the above depending on user and application-specific options/behavior.

Petr, can you please take a look at how rabbit behaves internally?
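To see how the Erlang side resolves the controller names per address family (the same inet:getaddr/2 call as in the snippet quoted above), a small probe can be run on a controller. This is a diagnostic sketch only, not part of RabbitMQ or of the original investigation:

HOST=overcloud-controller-0.localdomain
erl -noshell -eval '
    [Host] = init:get_plain_arguments(),
    lists:foreach(fun(Fam) ->
        {Us, Res} = timer:tc(inet, getaddr, [Host, Fam]),
        io:format("~p: ~p (~p ms)~n", [Fam, Res, Us div 1000])
    end, [inet, inet6]),
    halt().' -extra "$HOST"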
(In reply to Fabio Massimo Di Nitto from comment #18)
> (In reply to Marius Cornea from comment #17)
> > (In reply to Fabio Massimo Di Nitto from comment #15)
> > > (In reply to Marius Cornea from comment #13)
> > > > Created attachment 1185257 [details]
> > > > tcpdump + rabbitmq logs
> > > >
> > > > Attaching the tcpdump info and the rabbitmq logs from the controllers.
> > > > The tcpdump filter that I used:
> > > >
> > > > tcpdump -i eth0 ip src or dst 10.11.5.19 -w controller-1.pcap
> > > >
> > > > where 10.11.5.19 is the 2nd DNS server in resolv.conf and eth0 is the
> > > > interface that provides connectivity to 10.11.5.19.
> > > >
> > > > From what I can see there are lots of A queries sent for
> > > > overcloud-controller-x.localdomain that fail because 10.11.5.19
> > > > doesn't have any records for these names.
> > >
> > > Does the primary DNS server have records for those names? If yes, then
> > > the secondary DNS server should have them too, by definition of being a
> > > secondary.
> >
> > No, the primary DNS server doesn't have records for these names either.
>
> Ok, in this case we should change the bug title entirely. It has nothing to
> do with DNS being on or off, but rather with how resolv.conf is configured.
>
> Any given application will go out to DNS for lookups unless:
> 1) the entry is in a local cache (if enabled)
> 2) it resolves via resolv.conf or other configurable means.

My apologies, I meant to say how /etc/hosts is configured/populated.

> > > > In the IPv4 case the controllers don't need to make these queries
> > > > because the IPv4 addresses for overcloud-controller-x.localdomain are
> > > > stored in /etc/hosts. In IPv6 deployments /etc/hosts stores IPv6
> > > > addresses, hence the A queries are sent to the resolvers.
> > > >
> > > > I'm not sure which service generates these queries; I'm just noting
> > > > what the difference between IPv4 and IPv6 deployments might be.
> > >
> > > I see two potential issues in this deployment.
> > >
> > > 1) In a dual-stack environment (v4 and v6), resolv.conf should contain
> > > entries for both protocols. This should be addressed in tripleo
> > > deployments.
>
> Could you be so kind to retest by manually adding entries for ipv4 (and
> ipv6) in resolv.conf?
>
> > > 2) There is at least one application out there that prefers v4 to v6,
> > > which is against the RFC, but it's not completely wrong either in
> > > dual-stack environments such as this one.
> > >
> > > I think the offending code in rabbit might be here:
> > >
> > > rabbit_common/src/rabbit_networking.erl:
> > >
> > > gethostaddr(Host, auto) ->
> > >     Lookups = [{Family, inet:getaddr(Host, Family)} || Family <- [inet, inet6]],
> > >     case [{IP, Family} || {Family, {ok, IP}} <- Lookups] of
> > >         []  -> host_lookup_error(Host, Lookups);
> > >         IPs -> IPs
> > >     end;
> > >
> > > whereas it should do [inet6, inet] in sequence, but I am no rabbit
> > > expert :)
>
> Just to clarify, the RFC says:
>
> - If IPv6 is configured on the host, then IPv6 should be preferred over IPv4
> - If IPv6 can't be used, then fall back to IPv4
>
> Clearly there are many permutations of the above depending on user and
> application-specific options/behavior.
>
> Petr, can you please take a look at how rabbit behaves internally?

Is it possible that rabbit is still talking on IPv4 because of https://bugzilla.redhat.com/show_bug.cgi?id=1347802?

Marian, can you check if your deployment has the fix for 1347802?
After manually adding an IPv4 entry in /etc/hosts for the overcloud-controller-x.localdomain names I wasn't able to reproduce this issue:

  fd00:fd00:fd00:2000::15 overcloud-controller-0.localdomain overcloud-controller-0
+ 192.168.0.13 overcloud-controller-0.localdomain overcloud-controller-0
  fd00:fd00:fd00:2000::12 overcloud-controller-1.localdomain overcloud-controller-1
+ 192.168.0.12 overcloud-controller-1.localdomain overcloud-controller-1
  fd00:fd00:fd00:2000::14 overcloud-controller-2.localdomain overcloud-controller-2
+ 192.168.0.14 overcloud-controller-2.localdomain overcloud-controller-2

The problem with this workaround is that the mapping doesn't really reflect the isolated networks: overcloud-controller-0.localdomain points to the internal_api network while the added IPv4 address is associated with the ctlplane network.

Chatting with Fabio on IRC we reached the following conclusion:

@fabbione ╡ mcornea: i need to think about it a minute or two
        ⤷ ╡ either way my take is:
        ⤷ ╡ 1) there is a workaround even tho it's not ideal
        ⤷ ╡ 2) it only happens in a very specific corner case (dns down, and yet I don't even understand why we are going out to talk to a DNS with /etc/hosts populated)
  mcornea ╡ well, as you said if some app is doing v4 resolution first then it makes sense, because there's no v4 entry for overcloud-controller-0.localdomain
@fabbione ╡ I think it's safe to say that the bug is not a blocker
        ⤷ ╡ but we need to continue investigating
        ⤷ ╡ if the requirement is to run on a pure internal IPv6 network
        ⤷ ╡ then we need to go around looing at how each app behaves in that respect
        ⤷ ╡ right, but then they shuold go back to Ipv6
        ⤷ ╡ and they don't
        ⤷ ╡ so they trust the first response from the DNs (entry not found)
        ⤷ ╡ and fail
        ⤷ ╡ that is incorrect
  mcornea ╡ right
@fabbione ╡ mcornea: does it sound correct to you?
        ⤷ ╡ ok
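After adding the IPv4 lines, the effect can be verified locally on the controllers; the names should now resolve for both families from /etc/hosts, so no A query has to leave the node even with the first nameserver blackholed (addresses as in this environment):

for n in 0 1 2; do
    echo "== overcloud-controller-$n =="
    getent ahosts overcloud-controller-$n.localdomain   # lists both the IPv6 and IPv4 entries
done

# Re-run the failure scenario on the virthost and watch the cluster:
#   ip route add blackhole 10.16.36.29/32
#   pcs status | grep -E 'rabbitmq|FAILED|Stopped'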
I found the reference to the RFC recommendations on how to port applications to IPv6: https://tools.ietf.org/html/rfc4038#page-11

=== quoting ===

Implementations typically prefer IPv6 by default if the remote node and application support it. However, if IPv6 connections fail, version-independent applications will automatically try IPv4 ones. The resolver returns a list of valid addresses for the remote node, and applications can iterate through all of them until connection succeeds.

=== end of quote ===

That said, this is only a recommendation. Even an IPv6-capable application could choose (or should be allowed to choose) IPv4 in case IPv6 fails to connect.

So Petr, can you please check the logs again here to see why rabbit failed to connect over IPv6 in the first place?
An additional important note about the workaround: make sure the IPv4 entries are placed before the HEAT_HOSTS_START section in /etc/hosts so they don't get overwritten when the next overcloud deploy command is run. Below is such an example:

[root@overcloud-controller-2 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.13 overcloud-controller-0.localdomain overcloud-controller-0
192.168.0.12 overcloud-controller-1.localdomain overcloud-controller-1
192.168.0.14 overcloud-controller-2.localdomain overcloud-controller-2
# HEAT_HOSTS_START - Do not edit manually within this section!
fd00:fd00:fd00:2000::13 overcloud-compute-0.localdomain overcloud-compute-0
192.168.0.11 overcloud-compute-0-external
fd00:fd00:fd00:2000::13 overcloud-compute-0-internalapi
fd00:fd00:fd00:3000::15 overcloud-compute-0-storage
192.168.0.11 overcloud-compute-0-storagemgmt
10.0.1.139 overcloud-compute-0-tenant
192.168.0.11 overcloud-compute-0-management
fd00:fd00:fd00:2000::15 overcloud-controller-0.localdomain overcloud-controller-0
2001:db8:ca2:4::12 overcloud-controller-0-external
fd00:fd00:fd00:2000::15 overcloud-controller-0-internalapi
fd00:fd00:fd00:3000::13 overcloud-controller-0-storage
fd00:fd00:fd00:4000::14 overcloud-controller-0-storagemgmt
10.0.1.140 overcloud-controller-0-tenant
192.168.0.13 overcloud-controller-0-management
fd00:fd00:fd00:2000::12 overcloud-controller-1.localdomain overcloud-controller-1
2001:db8:ca2:4::11 overcloud-controller-1-external
fd00:fd00:fd00:2000::12 overcloud-controller-1-internalapi
fd00:fd00:fd00:3000::11 overcloud-controller-1-storage
fd00:fd00:fd00:4000::11 overcloud-controller-1-storagemgmt
10.0.1.138 overcloud-controller-1-tenant
192.168.0.12 overcloud-controller-1-management
fd00:fd00:fd00:2000::14 overcloud-controller-2.localdomain overcloud-controller-2
2001:db8:ca2:4::13 overcloud-controller-2-external
fd00:fd00:fd00:2000::14 overcloud-controller-2-internalapi
fd00:fd00:fd00:3000::12 overcloud-controller-2-storage
fd00:fd00:fd00:4000::13 overcloud-controller-2-storagemgmt
10.0.1.141 overcloud-controller-2-tenant
192.168.0.14 overcloud-controller-2-management
fd00:fd00:fd00:3000::14 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0
192.168.0.30 overcloud-cephstorage-0-external
192.168.0.30 overcloud-cephstorage-0-internalapi
fd00:fd00:fd00:3000::14 overcloud-cephstorage-0-storage
fd00:fd00:fd00:4000::12 overcloud-cephstorage-0-storagemgmt
192.168.0.30 overcloud-cephstorage-0-tenant
192.168.0.30 overcloud-cephstorage-0-management
# HEAT_HOSTS_END
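A hedged sketch of scripting that placement so the entries stay above the Heat-managed block and survive a redeploy; the addresses are the ctlplane IPs from this environment and this is not a supported TripleO mechanism:

while read -r entry; do
    grep -qF "$entry" /etc/hosts || \
        sed -i "/^# HEAT_HOSTS_START/i $entry" /etc/hosts   # GNU sed one-line insert
done <<'EOF'
192.168.0.13 overcloud-controller-0.localdomain overcloud-controller-0
192.168.0.12 overcloud-controller-1.localdomain overcloud-controller-1
192.168.0.14 overcloud-controller-2.localdomain overcloud-controller-2
EOF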
I can't reproduce it using packages from OSP 11. Marius, could you please retest it on OSP 11?

Here is what I tested so far: an OSP 11 machine with IPv4 and IPv6 addresses, and an /etc/resolv.conf containing both IPv4 and IPv6 DNS records:

===============================
# Generated by NetworkManager
search redhat.local
nameserver 172.16.0.1
nameserver 2620:52:0:13b8::fe
nameserver 10.0.0.1
===============================

I've tried disabling 1 or 2 DNS records in every combination, but no luck - everything worked w/o any extra timeouts.

Here are the package versions:

erlang-erts-18.3.4.4-1.el7ost.x86_64
rabbitmq-server-3.6.5-5.el7ost.noarch
resource-agents-3.9.5-82.el7_3.11.x86_64
(In reply to Peter Lemenkov from comment #25)
> I can't reproduce it using packages from OSP 11. Marius, could you please
> retest it on OSP 11?

I accidentally tested the same scenario as reported initially on OSP10 and this is not reproducing anymore. I'll test it with OSP11 as well and get back with my results.
Same goes for OSP11: the initial report doesn't reproduce anymore.
Although we failed to reproduce it recently, I have a feeling that we finally caught it. It turned out that if IPv6 is configured, Erlang prior to version 19.0 still tries to use IPv4 for domain name resolution. We've opened bug 1461190 for this issue.
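A quick check of which Erlang release a controller is actually running, to see whether it predates OTP 19.0; just a sketch using the package names already listed in this report:

rpm -q erlang-erts rabbitmq-server
erl -noshell -eval 'io:format("OTP ~s~n", [erlang:system_info(otp_release)]), halt().'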
(In reply to Marius Cornea from comment #27)
> Same goes for OSP11: the initial report doesn't reproduce anymore.

Marius, could you please retest this issue with the recent packages?