Bug 1117886 - Gluster not resolving hosts with IPv6 only lookups
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Nithin
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: glusterfs-3.6.0 1310445
 
Reported: 2014-07-09 15:17 UTC by Anders Blomdell
Modified: 2016-06-16 12:39 UTC (History)

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 922801
: 1310445 (view as bug list)
Environment:
Last Closed: 2016-06-16 12:39:37 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
Reversal of commit 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff (7.36 KB, patch)
2014-07-09 15:17 UTC, Anders Blomdell
IPv6 bug fixes patch (10.61 KB, patch)
2015-06-11 18:34 UTC, Nithin
Manual test log 1 (72.30 KB, text/plain)
2015-08-23 05:48 UTC, Nithin
Manual test log 2 (39.68 KB, text/plain)
2015-08-23 05:49 UTC, Nithin
manual_test_run2_peer1.log (150.14 KB, text/plain)
2015-08-27 18:27 UTC, Nithin
manual_test_run2_peer2.log (78.56 KB, text/plain)
2015-08-27 18:28 UTC, Nithin
manual_test_run2_client.log (48.48 KB, text/plain)
2015-08-27 18:28 UTC, Nithin
Regression log - IPv4 (450.35 KB, text/plain)
2015-09-13 16:23 UTC, Nithin
Regression log -IPv6 (383.56 KB, text/plain)
2015-09-13 16:24 UTC, Nithin

Description Anders Blomdell 2014-07-09 15:17:58 UTC
Created attachment 916832 [details]
Reversal of commit 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff

+++ This bug was initially created as a clone of Bug #922801 +++

Description of problem:

  As reported on IRC by alex88, gluster isn't resolving hosts that have
  only IPv6 DNS entries (i.e. AAAA instead of A).

    [2013-03-18 10:23:42.211253] I [glusterfsd.c:1666:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.3.1
    [2013-03-18 10:23:42.213462] E [common-utils.c:125:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
    [2013-03-18 10:23:42.213537] E [name.c:243:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host storage.site.com
    [2013-03-18 10:23:42.213609] E [glusterfsd-mgmt.c:1787:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: Success
    [2013-03-18 10:23:42.213649] I [glusterfsd-mgmt.c:1790:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
    [2013-03-18 10:23:42.213846] W [glusterfsd.c:831:cleanup_and_exit] (-->/usr/sbin/glusterfs(glusterfs_mgmt_init+0x1ff) [0x7f38a41a41af] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_start+0x12) [0x7f38a3af0922] (-->/usr/sbin/glusterfs(+0xd486) [0x7f38a41a4486]))) 0-: received signum (1), shutting down
    [2013-03-18 10:23:42.213905] I [fuse-bridge.c:4648:fini] 0-fuse: Unmounting '/mnt/site-development/'.


Version-Release number of selected component (if applicable):

  glusterfs 3.3.1 built on Feb 21 2013 03:24:40
  "got by the ubuntu ppa"


How reproducible:

  Every time.


Steps to Reproduce:

  1. Install gluster.
  2. Attempt to use a host that only has AAAA DNS records.
  3. Fails here.
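
A quick way to confirm that a host really is AAAA-only before blaming gluster is to ask the resolver for each address family separately. The following is a minimal sketch, not part of gluster; the helper name resolve_by_family is hypothetical, and port 24007 is simply the glusterd port mentioned elsewhere in this report.

```python
import socket


def resolve_by_family(host, port=24007):
    """Return {"inet": [...], "inet6": [...]} with the addresses found per family."""
    results = {}
    for name, family in (("inet", socket.AF_INET), ("inet6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            # No records of this family, or the resolver itself failed.
            results[name] = []
        else:
            # info[4] is the sockaddr tuple; its first element is the address.
            results[name] = sorted({info[4][0] for info in infos})
    return results
```

For an AAAA-only name such as the storage.site.com host in the log above, one would expect an empty "inet" list and a non-empty "inet6" list.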

--- Additional comment from Joe Julian on 2014-04-18 20:15:50 EDT ---

We need IPv6 for 3.6. Can we make this a priority for this cycle?

--- Additional comment from Justin Clift on 2014-04-18 21:59:30 EDT ---

It'll probably depend on the code for "Better Peer Identification" getting in first:

  http://www.gluster.org/community/documentation/index.php/Features/Better_peer_identification

If that gets in with sufficient lead time, it might be possible to add IPv6 support for 3.6 as well.  Probably need to ask Kaushal if he thinks it's feasible.

--- Additional comment from Anders Blomdell on 2014-06-11 06:44:36 EDT ---

This is still a problem with 3.5.0, it looks like almost all ipv6 support got ripped out with commit 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff :-(

--- Additional comment from Adam Huffman on 2014-06-12 08:19:18 EDT ---

I see similar problems in a dual-stack setup.

--- Additional comment from Anders Blomdell on 2014-06-12 10:16:26 EDT ---

Reverting 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff (with one small manual merge) made IPv6 work for me. :-)

Let's hope it gets into 3.6.

Comment 1 Anders Blomdell 2014-07-09 15:20:53 UTC
Reinstating ipv6 support by reverting 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff
    
Make sure to check for regressions in https://bugzilla.redhat.com/show_bug.cgi?id=764655

Comment 2 Anand Avati 2014-07-10 18:37:12 UTC
REVIEW: http://review.gluster.org/8292 (Reinstating ipv6 support by reverting 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff) posted (#2) for review on master by Anders Blomdell (anders.blomdell.se)

Comment 3 Anand Avati 2014-07-10 18:39:17 UTC
REVIEW: http://review.gluster.org/8292 (Reinstate ipv6 support) posted (#3) for review on master by Anders Blomdell (anders.blomdell.se)

Comment 4 Anand Avati 2014-07-11 16:02:43 UTC
REVIEW: http://review.gluster.org/8292 (Reinstate ipv6 support) posted (#4) for review on master by Anders Blomdell (anders.blomdell.se)

Comment 5 Gulf Zhou 2014-12-12 08:20:36 UTC
(In reply to Anders Blomdell from comment #1)
> Reinstating ipv6 support by reverting
> 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff
>     
> Make sure to check for regressions in
> https://bugzilla.redhat.com/show_bug.cgi?id=764655

Has this been done on the 3.5 branch? Thanks.

Comment 6 Anders Blomdell 2014-12-12 09:05:10 UTC
(In reply to Gulf Zhou from comment #5)
> Has this been done on the 3.5 branch? Thanks.

No, and it probably shouldn't be. The problem was more complex than I anticipated: it occurs in multiple places, and looking up hosts and connecting to them are unfortunately decoupled. This means that replacing gethostbyname with getaddrinfo will (always?) return the IPv6 address, so hosts without IPv6 connectivity will be hosed (AFAICT this was the [unexplained/unexplored] reason for removing IPv6 support in the first place).
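
To illustrate the decoupling problem described here: when lookup and connect are done together, a client can walk every address getaddrinfo returns and fall back to the next one if a connection fails, so a host without IPv6 connectivity still ends up on IPv4. The sketch below shows that general pattern in Python; it is not gluster code, and the function name connect_first_reachable is hypothetical.

```python
import socket


def connect_first_reachable(host, port, timeout=3.0):
    """Try each address returned by getaddrinfo() until one connects."""
    last_error = None
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            # This address family is not usable on this machine.
            last_error = exc
            continue
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except OSError as exc:
            # Unreachable over this family; try the next candidate.
            last_error = exc
            sock.close()
            continue
        return sock
    raise last_error or OSError("no addresses returned for %r" % host)
```

If only the first result is kept and the connect happens elsewhere, as the comment above describes, this fallback is not possible.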

My current plan is to wait for:

http://supercolony.gluster.org/pipermail/gluster-devel/2014-December/043180.html

to get some momentum before I look into it again, since it looks like it has the potential to provide the right decoupling for network connections.

Comment 7 Gulf Zhou 2014-12-15 01:18:09 UTC
Thanks, Anders. I read through the thread. IPv6 support is a very important feature for us. How will this be prioritized? I don't have a feeling for when it can be solved. Any comments?

Comment 8 Gulf Zhou 2015-03-13 01:20:57 UTC
Hi, Anders

   Is there any update for the IPv6 support fix? 

   Thanks.

Comment 9 Anders Blomdell 2015-03-13 12:10:51 UTC
Waiting for 4.6 (or 4.7) to get stable enough in my lab before I continue with IPv6, :-/

Comment 10 Anders Blomdell 2015-03-13 12:12:25 UTC
I meant: 3.6 or 3.7

Comment 11 Nithin 2015-06-11 18:33:11 UTC
Hi Niels & Anders,

Can I contribute to this bug fix? I've worked on Gluster IPv6 functionality bugs in 3.3.2 at my previous organization and was able to bring up gluster on IPv6 link-local addresses as well.

Please find my work-in-progress patch attached. I was able to create volumes with 3 peers and add bricks. I'll continue testing other basic functionality and see what needs to be modified.

Brief info about the patch:
Here I'm trying to use the "transport.address-family" option in the /etc/glusterfs/glusterd.vol file and then propagate it to the server and client vol files and their translators.

This way, when the user specifies "transport.address-family inet6" in the glusterd.vol file, all glusterd servers open AF_INET6 sockets, and the same information is stored in glusterd_volinfo and used when generating vol config files.
 
-thanks
Nithin
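
For readers unfamiliar with the file, a glusterd.vol carrying the option described above might look roughly like the fragment below. Only the transport.address-family line comes from this report; the surrounding lines are assumed from a typical default install and may differ on your system.

```
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket
    # The option discussed in this comment:
    option transport.address-family inet6
end-volume
```

glusterd would presumably need a restart to pick up the change.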

Comment 12 Nithin 2015-06-11 18:34:32 UTC
Created attachment 1037799 [details]
IPv6 bug fixes patch

Comment 13 joachim 2015-08-12 17:42:53 UTC
Hi,

So, I'm trying to get glusterfs working in an IPv6-only environment. I tried two freshly installed CentOS boxes, both having only IPv6 addresses and quad-A DNS records.

I tried the latest glusterfs 3.5, 3.6 and 3.7. In all of them, glusterd by default only listens on IPv4 (tcp 24007), hence the initial 'probe' doesn't work.

Am I understanding this bug correctly that _all_ IPv6-support has been (temporarily) removed until this bug is resolved?

Comment 14 Nithin 2015-08-15 15:29:23 UTC
(In reply to Joachim Tingvold from comment #13)
> Hi,
> 
> So, I'm trying to get glusterfs working in an IPv6-only environment. I tried two
> freshly installed CentOS boxes, both having only IPv6 addresses and quad-A
> DNS records.
> 
> I tried the latest glusterfs 3.5, 3.6 and 3.7. In all of them, glusterd by default
> only listens on IPv4 (tcp 24007), hence the initial 'probe' doesn't work.
> 
> Am I understanding this bug correctly that _all_ IPv6-support has been
> (temporarily) removed until this bug is resolved?

Yes, basically this bug is to fix issues that are breaking IPv6 functionality.
Most of the support code is already there but is broken probably due to lack of testing focus on IPv6.
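
To check the symptom from comment 13 (glusterd reachable over IPv4 but not over IPv6), one can simply attempt a TCP connection to the management port over each family. This is a small diagnostic sketch, not part of gluster; accepts_tcp is a hypothetical helper and it only distinguishes families by looking for a colon in a literal address.

```python
import socket


def accepts_tcp(address, port, timeout=2.0):
    """Return True if a TCP connect to the literal address and port succeeds."""
    # A colon in a literal address means IPv6; otherwise assume IPv4.
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect((address, port))
        return True
    except OSError:
        # Refused, timed out, or the address family is unavailable.
        return False
```

On a node showing this bug, accepts_tcp("127.0.0.1", 24007) would be expected to succeed while accepts_tcp("::1", 24007) fails.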

Comment 15 Gulf Zhou 2015-08-18 06:51:09 UTC
Hi, Nithin

     Has the patch you proposed been merged into the glusterfs branches? Or is there any plan to merge it?


     Thanks.

Comment 16 Nithin 2015-08-19 08:06:43 UTC
Hi Gulf,

I have a patch that is almost the same as the one attached in one of my comments above.
Basic functionality is working; I'm trying to validate whether there are any errors, and I have yet to run regression. Once done, I'll raise a review.

Comment 17 Anand Avati 2015-08-22 11:24:47 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: transport address-family inet6 not working) posted (#1) for review on master by Nithin Kumar D (nithind1988)

Comment 18 Nithin 2015-08-23 05:48:39 UTC
Created attachment 1065976 [details]
Manual test log 1

Comment 19 Nithin 2015-08-23 05:49:12 UTC
Created attachment 1065977 [details]
Manual test log 2

Comment 20 Anand Avati 2015-08-27 18:23:24 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: transport address-family inet6 not working) posted (#2) for review on master by Nithin Kumar D (nithind1988)

Comment 21 Nithin 2015-08-27 18:27:49 UTC
Created attachment 1067846 [details]
manual_test_run2_peer1.log

Comment 22 Nithin 2015-08-27 18:28:17 UTC
Created attachment 1067847 [details]
manual_test_run2_peer2.log

Comment 23 Nithin 2015-08-27 18:28:46 UTC
Created attachment 1067848 [details]
manual_test_run2_client.log

Comment 24 Vijay Bellur 2015-09-10 00:48:57 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: transport address-family inet6 not working) posted (#3) for review on master by Nithin Kumar D (nithind1988)

Comment 25 Vijay Bellur 2015-09-13 16:08:14 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#4) for review on master by Nithin Kumar D (nithind1988)

Comment 26 Nithin 2015-09-13 16:23:56 UTC
Created attachment 1072956 [details]
Regression log - IPv4

Comment 27 Nithin 2015-09-13 16:24:38 UTC
Created attachment 1072957 [details]
Regression log -IPv6

Comment 28 Vijay Bellur 2015-09-15 06:24:06 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#5) for review on master by Vijay Bellur (vbellur)

Comment 29 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
pre-release version is ambiguous and about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

Comment 30 Vijay Bellur 2015-11-11 06:26:34 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#6) for review on master by Nithin Kumar D (nithind1988)

Comment 31 Vijay Bellur 2015-11-15 17:06:16 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#7) for review on master by Nithin Kumar D (nithind1988)

Comment 32 Vijay Bellur 2015-12-04 00:18:58 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#8) for review on master by Nithin Kumar D (nithind1988)

Comment 33 Vijay Bellur 2015-12-06 09:34:08 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#9) for review on master by Nithin Kumar D (nithind1988)

Comment 34 Vijay Bellur 2016-01-31 16:02:26 UTC
REVIEW: http://review.gluster.org/11988 (glusterd: Bug fixes for IPv6 support) posted (#10) for review on master by Nithin Kumar D (nithind1988)

Comment 35 Anoop C S 2016-02-17 08:59:52 UTC
Re-opening the BZ because we have patches posted which have not yet been merged.

Comment 36 Vijay Bellur 2016-02-20 17:16:48 UTC
COMMIT: http://review.gluster.org/11988 committed in master by Jeff Darcy (jdarcy) 
------
commit 46bd29e0f2a7fc9278068a06d12066d614f365ec
Author: Nithin D <nithind1988>
Date:   Sun Nov 15 22:14:43 2015 +0530

    glusterd: Bug fixes for IPv6 support
    
    Problem:
    Glusterd not working using ipv6 transport. The idea is with proper glusterd.vol configuration,
    1. glusterd needs to listen on default port (24007) as IPv6 TCP listener.
    2. Volume creation/deletion/mounting/add-bricks/delete-bricks/peer-probe
       needs to work using ipv6 addresses.
    3. Bricks need to listen on ipv6 addresses.
    All the above functionality is needed to say that glusterd supports ipv6 transport and this is broken.
    
    Fix:
    When "option transport.address-family inet6" option is present in glusterd.vol
    file, it is made sure that glusterd creates listeners using ipv6 sockets only and also the same information is saved
    inside brick volume files used by glusterfsd brick process when they are starting.
    
    Tests Run:
    Regression tests using ./run-tests.sh
        IPv4: Ran manually till tests/basic/rpm.t .
        IPv6: (Need to add the above mentioned config and also add an entry for "hostname ::1" in /etc/hosts)
            Started failing at ./tests/basic/glusterd/arbiter-volume-probe.t and ran successfully till here
    
    Unit Tests using Ipv6
        peer probe
        add-bricks
        remove-bricks
        create volume
        replace-bricks
        start volume
        stop volume
        delete volume
    
    Change-Id: Iebc96e6cce748b5924ce5da17b0114600ec70a6e
    BUG: 1117886
    Signed-off-by: Nithin D <nithind1988>
    Reviewed-on: http://review.gluster.org/11988
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Atin Mukherjee <amukherj>
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 37 bing.kaibing 2016-05-23 08:41:14 UTC

(In reply to Vijay Bellur from comment #36)

Hi,

   I still hit the same problem in an IPv6-only environment after following your suggestion. Please help. Here is my environment information:

# glusterfsd --version
glusterfs 3.7.11 built on Apr 27 2016 14:09:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

=====
netstat -anop | grep gluster
tcp        0      0 :::24007                    :::*                        LISTEN      10770/glusterd      off (0.00/0/0)
unix  2      [ ACC ]     STREAM     LISTENING     647165 10770/glusterd      /var/run/glusterd.socket
unix  2      [ ]         DGRAM                    647160 10770/glusterd


===== error when executing "gluster peer probe typhoon-adf-rr2-eth2"


[2016-05-23 08:31:55.179633] I [MSGID: 106487] [glusterd-handler.c:1239:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req typhoon-adf-rr2-eth2 24007
[2016-05-23 08:31:55.198211] I [MSGID: 106129] [glusterd-handler.c:3661:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: typhoon-adf-rr2-eth2 (24007)
[2016-05-23 08:31:55.202330] I [rpc-clnt.c:1004:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-05-23 08:31:55.208240] E [MSGID: 101075] [common-utils.c:306:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Temporary failure in name resolution)
[2016-05-23 08:31:55.208285] E [name.c:247:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host typhoon-adf-rr2-eth2
[2016-05-23 08:31:55.210194] I [MSGID: 106498] [glusterd-handler.c:3589:glusterd_friend_add] 0-management: connect returned 0
[2016-05-23 08:31:55.210340] I [MSGID: 106004] [glusterd-handler.c:5200:__glusterd_peer_rpc_notify] 0-management: Peer <typhoon-adf-rr2-eth2> (<00000000-0000-0000-0000-000000000000>), in state <Establishing Connection>, has disconnected from glusterd.


====== IPv6 can reach  typhoon-adf-rr2-eth2


ping6  typhoon-adf-rr2-eth2
PING typhoon-adf-rr2-eth2(typhoon-adf-rr2-eth2) 56 data bytes
64 bytes from typhoon-adf-rr2-eth2: icmp_seq=1 ttl=64 time=1.98 ms
64 bytes from typhoon-adf-rr2-eth2: icmp_seq=2 ttl=64 time=0.256 ms
64 bytes from typhoon-adf-rr2-eth2: icmp_seq=3 ttl=64 time=0.289 ms
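
When comparing logs like the ones in this report, it can help to pull out only the error-level lines together with the source function that emitted them. The sketch below does that with a regular expression; the line layout is inferred from the excerpts quoted in this bug and is an assumption, not an official log grammar, and the names LOG_LINE and error_lines are hypothetical.

```python
import re

# Layout inferred from the excerpts in this report; not an official grammar.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"  # MSGID is absent in older releases
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+"
    r"(?P<message>.*)$"
)


def error_lines(lines):
    """Return the parsed fields of every line logged at level E."""
    parsed = []
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") == "E":
            parsed.append(match.groupdict())
    return parsed
```

Applied to the excerpt above, this would single out the gf_resolve_ip6 and af_inet_client_get_remote_sockaddr entries, which are the two lines that point at name resolution rather than connectivity.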

Comment 38 Atin Mukherjee 2016-05-23 11:05:42 UTC
Actually in 3.7.11 we had to revert a recent change which broke SSL functionality. We are trying to fix it and release it in 3.7.12.

Comment 39 Niels de Vos 2016-06-16 12:39:37 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

