Bug 1688231 - geo-rep session creation fails with IPV6
Summary: geo-rep session creation fails with IPV6
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Aravinda VK
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1688833 1695436
Blocks: 1688239 1696807
 
Reported: 2019-03-13 11:34 UTC by SATHEESARAN
Modified: 2023-09-14 05:25 UTC
CC: 9 users

Fixed In Version: glusterfs-6.0-2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1688239 1688833
Environment:
Last Closed: 2019-10-30 12:20:22 UTC
Embargoed:


Attachments: none


Links:
Red Hat Product Errata RHEA-2019:3249 (last updated 2019-10-30 12:20:46 UTC)

Description SATHEESARAN 2019-03-13 11:34:18 UTC
Description of problem:
-----------------------
This issue is seen with the RHHI-V use case: VM images are stored on Gluster volumes and geo-replicated to the secondary site for DR.

When IPv6 is used, an additional mount option is required: --xlator-option="transport.address-family=inet6". But when geo-rep checks the slave space with gverify.sh, this mount option is not passed, and the script fails to mount either the master or the slave volume.
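
For comparison, a manual FUSE mount of the slave volume over IPv6 succeeds only when the option is passed explicitly. A minimal illustration (the mount point is assumed here, not taken from the bug):

[root@ ~]# mount -t glusterfs -o xlator-option="transport.address-family=inet6" slave.lab.eng.blr.redhat.com:/slave /mnt/slave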

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHGS 3.4.4 ( glusterfs-3.12.2-47 )

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Create geo-rep session from the master to slave

Actual results:
--------------
Creation of geo-rep session fails at gverify.sh

Expected results:
-----------------
Creation of geo-rep session should be successful

Additional info:

Comment 1 SATHEESARAN 2019-03-13 11:49:02 UTC
[root@ ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
2620:52:0:4624:5054:ff:fee9:57f8 master.lab.eng.blr.redhat.com 
2620:52:0:4624:5054:ff:fe6d:d816 slave.lab.eng.blr.redhat.com 

[root@ ~]# gluster volume info
 
Volume Name: master
Type: Distribute
Volume ID: 9cf0224f-d827-4028-8a45-37f7bfaf1c78
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: master.lab.eng.blr.redhat.com:/gluster/brick1/master
Options Reconfigured:
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
user.cifs: off
features.shard: on
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet6
nfs.disable: on

[root@localhost ~]# gluster volume geo-replication master slave.lab.eng.blr.redhat.com::slave create push-pem
Unable to mount and fetch slave volume details. Please check the log: /var/log/glusterfs/geo-replication/gverify-slavemnt.log
geo-replication command failed


Snip from gverify-slavemnt.log
<snip>
[2019-03-13 11:46:28.746494] I [MSGID: 100030] [glusterfsd.c:2646:main] 0-glusterfs: Started running glusterfs version 3.12.2 (args: glusterfs --xlator-option=*dht.lookup-unhashed=off --volfile-server slave.lab.eng.blr.redhat.com --volfile-id slave -l /var/log/glusterfs/geo-replication/gverify-slavemnt.log /tmp/gverify.sh.y1TCoY)
[2019-03-13 11:46:28.750595] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2019-03-13 11:46:28.753702] E [MSGID: 101075] [common-utils.c:482:gf_resolve_ip6] 0-resolver: getaddrinfo failed (family:2) (Name or service not known)
[2019-03-13 11:46:28.753725] E [name.c:267:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host slave.lab.eng.blr.redhat.com
[2019-03-13 11:46:28.753953] I [glusterfsd-mgmt.c:2337:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: slave.lab.eng.blr.redhat.com
[2019-03-13 11:46:28.753980] I [glusterfsd-mgmt.c:2358:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2019-03-13 11:46:28.753998] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-03-13 11:46:28.754073] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-03-13 11:46:28.754154] W [glusterfsd.c:1462:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xab) [0x7fc39d379bab] -->glusterfs(+0x11fcd) [0x56427db95fcd] -->glusterfs(cleanup_and_exit+0x6b) [0x56427db8eb2b] ) 0-: received signum (1), shutting down
[2019-03-13 11:46:28.754197] I [fuse-bridge.c:6611:fini] 0-fuse: Unmounting '/tmp/gverify.sh.y1TCoY'.
[2019-03-13 11:46:28.760213] I [fuse-bridge.c:6616:fini] 0-fuse: Closing fuse connection to '/tmp/gverify.sh.y1TCoY'.
</snip>
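
Note the getaddrinfo failure above reports family:2 (AF_INET), i.e. the mount helper attempted IPv4-only name resolution. As a manual check (illustrative, not something gverify.sh does today), re-running the same mount invocation with the inet6 xlator option lets the resolver use AF_INET6:

[root@ ~]# glusterfs --xlator-option=transport.address-family=inet6 --volfile-server slave.lab.eng.blr.redhat.com --volfile-id slave /mnt/check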

Comment 2 Aravinda VK 2019-03-19 09:19:41 UTC
The following changes were made to Geo-replication to support IPv6:

- Added the IPv6 mount option whenever a Gluster volume is mounted by Geo-rep
- The local gluster CLI connects to Glusterd over a Unix socket, which is why all local Gluster CLI commands work fine. Geo-rep, however, uses the gluster CLI with the `--remote-host=` option to fetch details from the remote Glusterd. Fixed IPv6 handling when --remote-host is used (see the example below).
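
For instance, a remote query of the kind Geo-rep issues internally (volume and host names taken from this report; the exact query is illustrative) goes over TCP rather than the Unix socket, and therefore needs the IPv6 handling:

[root@ ~]# gluster --remote-host=slave.lab.eng.blr.redhat.com volume info slave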


Known limitations:

- Only FQDNs are supported; Geo-rep fails if an IPv6 address is specified instead of an FQDN (see the example below)
- The IPv6-enabled state is taken from the master Glusterd; Geo-rep will fail if IPv6 is enabled on the master but disabled on the slave (and vice versa)
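
For example, under the first limitation a create command that substitutes the slave's IPv6 address (taken from the /etc/hosts above, purely for illustration) for its FQDN is expected to fail; the colons in the literal are also ambiguous with the host::volume separator:

[root@ ~]# gluster volume geo-replication master 2620:52:0:4624:5054:ff:fe6d:d816::slave create push-pem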

Upstream Patch: https://review.gluster.org/#/c/glusterfs/+/22363/
Downstream Patch: https://code.engineering.redhat.com/gerrit/#/c/165434/

Comment 3 SATHEESARAN 2019-03-19 09:49:19 UTC
Tested the fix with the scratch build, and it works as expected.
1. Able to create the geo-rep session
2. Files from the master volume synced to the slave volume over IPv6
Checked data integrity as well; no problems were observed.

This issue will be documented as a known issue for RHHI-V until the fix is included in the build.
Removing the release_blocker flag for RHGS 3.4.4 set on this bug.

Comment 4 Sahina Bose 2019-03-28 07:14:29 UTC
Aravinda, can you ack this bug?

Comment 11 SATHEESARAN 2019-08-29 05:16:37 UTC
Tested with RHGS nodes with IPv6 turned on and no IPv4, on the RHGS 3.5.0 interim build (glusterfs-6.0-13.el7rhgs).

Setup Details:
1. 3-node master gluster cluster (Trusted Storage Pool) and 3-node slave cluster
2. Replica 3 volume as both master and slave
3. The glusterd volfile was edited to set transport.address-family to inet6, and the glusterd service was restarted (see the snippet after this list)
4. Static IPv6 addresses used, with hostnames assigned in /etc/hosts
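
For reference, the volfile change in step 3 amounts to adding the following line to the "volume management" section of /etc/glusterfs/glusterd.vol on every node (standard install paths assumed) and restarting the service:

    option transport.address-family inet6

[root@ ~]# systemctl restart glusterd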

Steps
1. All the gluster commands used the IPv6 hostnames from /etc/hosts
2. Geo-rep session was established, and the gluster shared storage volume was enabled
3. FUSE mount was done on the master node, with the additional mount option "xlator-option='transport.address-family=inet6'"
4. A few files were written
5. A geo-rep checkpoint was set and the session was started (see the commands after this list)
6. Once the checkpoint was reached, the sha256sum of all the files on the slave side was computed
7. Checksums on the master side match those on the slave side
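
The checkpoint in steps 5-6 can be set and monitored with the standard geo-rep commands, and the slave-side checksums computed from a FUSE mount (host names reused from the earlier comments and the mount point assumed, for illustration):

[root@ ~]# gluster volume geo-replication master slave.lab.eng.blr.redhat.com::slave config checkpoint now
[root@ ~]# gluster volume geo-replication master slave.lab.eng.blr.redhat.com::slave status detail
[root@ ~]# find /mnt/slave -type f -exec sha256sum {} +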

Comment 13 errata-xmlrpc 2019-10-30 12:20:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249

Comment 14 Red Hat Bugzilla 2023-09-14 05:25:23 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

