1609163 – Fuse mount of volume fails when gluster_shared_storage is enabled

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterd
Sub Component:
Version:	rhgs-3.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	RHGS 3.4.0
Assignee:	Sanju
QA Contact:	Jilju Joy
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1503137 1610726

Reported:	2018-07-27 07:52 UTC by Jilju Joy
Modified:	2018-09-12 10:50 UTC
CC List:	10 users
Fixed In Version:	glusterfs-3.12.2-16
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1610726
Environment:
Last Closed:	2018-09-04 06:51:13 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2018:2607	0	None	None	None	2018-09-04 06:52:51 UTC

Description Jilju Joy 2018-07-27 07:52:33 UTC

Description of problem:
-----------------------
When mounting (glusterfs) a Distributed-Replicate volume from a node where gluster_shared_storage is enabled, the mount fails with the errors below in the log.


[2018-07-27 06:42:44.748162] W [MSGID: 114043] [client-handshake.c:1108:client_setvolume_cbk] 0-testvol-client-0: failed to set the volume [Permission denied]
[2018-07-27 06:42:44.748235] W [MSGID: 114007] [client-handshake.c:1137:client_setvolume_cbk] 0-testvol-client-0: failed to get 'process-uuid' from reply dict [Invalid argument]
[2018-07-27 06:42:44.748268] E [MSGID: 114044] [client-handshake.c:1143:client_setvolume_cbk] 0-testvol-client-0: SETVOLUME on remote-host failed: Authentication failed [Permission denied]
[2018-07-27 06:42:44.748298] I [MSGID: 114049] [client-handshake.c:1257:client_setvolume_cbk] 0-testvol-client-0: sending AUTH_FAILED event
[2018-07-27 06:42:44.748365] E [fuse-bridge.c:5328:notify] 0-fuse: Server authenication failed. Shutting down.
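To pull the error-level entries out of a client log like the one above, a small parser can help. This is only a sketch: the regular expression is inferred from the five lines quoted in this report (timestamp, level letter, optional MSGID, source location, domain, message) and is an assumption, not a documented log format. The names parse_line and error_entries are made up for illustration.

```python
import re
from typing import Dict, Iterable, List, Optional

# Layout inferred from the log lines quoted in this report (an assumption):
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
# The MSGID field is optional; the fuse-bridge line does not carry one.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"(?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<source>[^\]]+)\] "
    r"(?P<domain>[^:]+): "
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def error_entries(lines: Iterable[str]) -> List[Dict[str, Optional[str]]]:
    """Return the parsed entries whose level is 'E'; unparseable lines are skipped."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry["level"] == "E"]
```

Applied to the five lines above, this keeps the SETVOLUME failure (MSGID 114044) and the fuse-bridge shutdown message, which are the two entries that explain why the mount gave up.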

==============================================================
Version-Release number of selected component (if applicable):
-------------------------------------------------------------
[root@dhcp37-132 ~]# rpm -qa | grep glusterfs
glusterfs-server-3.12.2-14.el7rhgs.x86_64
glusterfs-3.12.2-14.el7rhgs.x86_64
glusterfs-ganesha-3.12.2-14.el7rhgs.x86_64
glusterfs-libs-3.12.2-14.el7rhgs.x86_64
glusterfs-fuse-3.12.2-14.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-14.el7rhgs.x86_64
glusterfs-api-3.12.2-14.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-14.el7rhgs.x86_64
glusterfs-rdma-3.12.2-14.el7rhgs.x86_64
glusterfs-cli-3.12.2-14.el7rhgs.x86_64

=================================================================
How reproducible:
-----------------
3/3

================================================================
Steps to Reproduce:
-------------------
1. Create a 6x3 Distributed-Replicate volume in a cluster where gluster_shared_storage is enabled.
2. Try to mount the volume on a fuse client.
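For anyone retrying these steps, the sketch below builds the argument lists for the two commands without running them. A 6x3 volume needs 18 bricks; laying them out round-robin across the hosts keeps each consecutive group of three (one replica set) on three different hosts. The host names, the /bricks/brickN/ path layout and the mount point are placeholders and were not part of the original setup; the volume name testvol is taken from the log above.

```python
from typing import List


def build_create_command(
    volname: str, hosts: List[str], distribute: int = 6, replica: int = 3
) -> List[str]:
    """Build 'gluster volume create' arguments for a distribute x replica volume.

    Bricks are assigned round-robin, so the members of every replica set
    (each run of `replica` consecutive bricks) land on distinct hosts.
    """
    if len(hosts) < replica:
        raise ValueError("need at least as many hosts as the replica count")
    bricks = []
    for index in range(distribute * replica):
        host = hosts[index % len(hosts)]
        # Hypothetical brick path layout, chosen only for illustration.
        bricks.append(f"{host}:/bricks/brick{index // len(hosts)}/{volname}")
    return ["gluster", "volume", "create", volname, "replica", str(replica), *bricks]


def build_mount_command(server: str, volname: str, mountpoint: str) -> List[str]:
    """Build the fuse mount command for a volume served by `server`."""
    return ["mount", "-t", "glusterfs", f"{server}:/{volname}", mountpoint]
```

The lists can be passed to subprocess.run on a test cluster; the volume also has to be started with "gluster volume start" before the mount is attempted.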

================================================================
Actual results:
---------------
Mount operation fails with the below error message:
"Mount failed. Please check the log file for more details."

===============================================================
Expected results:
-----------------
Volume should mount properly.

==============================================================
Additional info:
----------------
* The volume can be mounted locally, but not on any other node in the cluster.
* gnfs mount succeeds.
* Bricks of the newly created volume are multiplexed with those of the existing gluster_shared_storage volume.
* It seems the two volumes, gluster_shared_storage and testvolume, are not distinguished from each other. As a result, authentication for mounting testvolume is handled the way it is for gluster_shared_storage (which can be mounted on localhost only).

===============================================================
gluster-health-report
---------------------

[root@dhcp37-173 ~]# gluster-health-report

Loaded reports: errors_in_logs, disk_usage, gfid-mismatch-dht-report, memory_usage, firewall-check, kernel_issues, glusterd, glusterd-peer-disconnect, coredump, glusterd_volume_version_cksum_errors, glusterd-op-version, georep, errors_in_logs, ifconfig, nic-health, process_status

[     OK] Disk used percentage  path=/  percentage=17
[     OK] Disk used percentage  path=/var  percentage=17
[     OK] Disk used percentage  path=/tmp  percentage=17
[     OK] All peers are in connected state  connected_count=5  total_peer_count=5
[     OK] no gfid mismatch
[     OK] op-version is up to date  op_version=  max_op_version=
[     OK] The maximum size of core files created is set to unlimted.
[     OK] Ports open for glusterd:
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1513/glusterd       

[     OK] Ports open for glusterfsd:
3:tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      17293/glusterfsd    
4:tcp        0      0 0.0.0.0:49153           0.0.0.0:*               LISTEN      17333/glusterfsd    

[  ERROR] Report failure  report=report_check_worker_restarts
[WARNING] Glusterd uptime is less than 24 hours  uptime_sec=5619
[WARNING] Errors in Glusterd log file  num_errors=79
[WARNING] Warnings in Glusterd log file  num_warning=37
[     OK] No errors seen at network card
[     OK] No errors seen at network card
High CPU usage by Self-heal
[WARNING] Errors in Glusterd log file num_errors=199
[WARNING] Warnings in Glusterd log file num_warnings=128

....
You can find the detailed health-reportat /var/log/glusterfs/gluster-health-report-2018-07-27-11-34.log
================================================================================

[root@dhcp37-50 ~]# gluster-health-report

Loaded reports: errors_in_logs, disk_usage, gfid-mismatch-dht-report, memory_usage, firewall-check, kernel_issues, glusterd, glusterd-peer-disconnect, coredump, glusterd_volume_version_cksum_errors, glusterd-op-version, georep, errors_in_logs, ifconfig, nic-health, process_status

[     OK] Disk used percentage  path=/  percentage=17
[     OK] Disk used percentage  path=/var  percentage=17
[     OK] Disk used percentage  path=/tmp  percentage=17
[     OK] All peers are in connected state  connected_count=5  total_peer_count=5
[     OK] no gfid mismatch
[ NOT OK] Failed to check op-version
[     OK] The maximum size of core files created is set to unlimted.
[     OK] Ports open for glusterd:
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1525/glusterd       

[     OK] Ports open for glusterfsd:
3:tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      2369/glusterfsd     
4:tcp        0      0 0.0.0.0:49153           0.0.0.0:*               LISTEN      2378/glusterfsd     

[  ERROR] Report failure  report=report_check_worker_restarts
[WARNING] Glusterd uptime is less than 24 hours  uptime_sec=5745
[WARNING] Errors in Glusterd log file  num_errors=107
[WARNING] Warnings in Glusterd log file  num_warning=56
[     OK] No errors seen at network card
[     OK] No errors seen at network card
0
[WARNING] Errors in Glusterd log file num_errors=175
[WARNING] Warnings in Glusterd log file num_warnings=187

....
You can find the detailed health-reportat /var/log/glusterfs/gluster-health-report-2018-07-27-11-41.log

================================================================================
[root@dhcp37-132 ~]# gluster-health-report

Loaded reports: errors_in_logs, disk_usage, gfid-mismatch-dht-report, memory_usage, firewall-check, kernel_issues, glusterd, glusterd-peer-disconnect, coredump, glusterd_volume_version_cksum_errors, glusterd-op-version, georep, errors_in_logs, ifconfig, nic-health, process_status

[     OK] Disk used percentage  path=/  percentage=15
[     OK] Disk used percentage  path=/var  percentage=15
[     OK] Disk used percentage  path=/tmp  percentage=15
[     OK] All peers are in connected state  connected_count=5  total_peer_count=5
[     OK] no gfid mismatch
[ NOT OK] Failed to check op-version
[     OK] The maximum size of core files created is set to unlimted.
[     OK] Ports open for glusterd:
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1505/glusterd       

[     OK] Ports open for glusterfsd:
3:tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      2365/glusterfsd     
4:tcp        0      0 0.0.0.0:49153           0.0.0.0:*               LISTEN      2374/glusterfsd     

[  ERROR] Report failure  report=report_check_worker_restarts
[WARNING] Glusterd uptime is less than 24 hours  uptime_sec=5572
[WARNING] Errors in Glusterd log file  num_errors=109
[WARNING] Warnings in Glusterd log file  num_warning=63
[     OK] No errors seen at network card
[     OK] No errors seen at network card
0
[WARNING] Errors in Glusterd log file num_errors=166
[WARNING] Warnings in Glusterd log file num_warnings=148

....
You can find the detailed health-reportat /var/log/glusterfs/gluster-health-report-2018-07-27-11-42.log
================================================================================
[root@dhcp37-172 ~]# gluster-health-report

Loaded reports: errors_in_logs, disk_usage, gfid-mismatch-dht-report, memory_usage, firewall-check, kernel_issues, glusterd, glusterd-peer-disconnect, coredump, glusterd_volume_version_cksum_errors, glusterd-op-version, georep, errors_in_logs, ifconfig, nic-health, process_status

[     OK] Disk used percentage  path=/  percentage=16
[     OK] Disk used percentage  path=/var  percentage=16
[     OK] Disk used percentage  path=/tmp  percentage=16
[     OK] All peers are in connected state  connected_count=5  total_peer_count=5
[     OK] no gfid mismatch
[     OK] op-version is up to date  op_version=  max_op_version=
[     OK] The maximum size of core files created is set to unlimted.
[     OK] Ports open for glusterd:
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1528/glusterd       

[     OK] Ports open for glusterfsd:
3:tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      2366/glusterfsd     
4:tcp        0      0 0.0.0.0:49153           0.0.0.0:*               LISTEN      4515/glusterfsd     

[  ERROR] Report failure  report=report_check_worker_restarts
[WARNING] Glusterd uptime is less than 24 hours  uptime_sec=4619
[WARNING] Errors in Glusterd log file  num_errors=56
[WARNING] Warnings in Glusterd log file  num_warning=79
[     OK] No errors seen at network card
[     OK] No errors seen at network card
0
[WARNING] Errors in Glusterd log file num_errors=36136
[WARNING] Warnings in Glusterd log file num_warnings=174

....
You can find the detailed health-reportat /var/log/glusterfs/gluster-health-report-2018-07-27-11-43.log

================================================================================
[root@dhcp37-197 ~]# gluster-health-report

Loaded reports: errors_in_logs, disk_usage, gfid-mismatch-dht-report, memory_usage, firewall-check, kernel_issues, glusterd, glusterd-peer-disconnect, coredump, glusterd_volume_version_cksum_errors, glusterd-op-version, georep, errors_in_logs, ifconfig, nic-health, process_status

[     OK] Disk used percentage  path=/  percentage=18
[     OK] Disk used percentage  path=/var  percentage=18
[     OK] Disk used percentage  path=/tmp  percentage=18
[     OK] All peers are in connected state  connected_count=5  total_peer_count=5
[     OK] no gfid mismatch
[     OK] op-version is up to date  op_version=  max_op_version=
[     OK] The maximum size of core files created is set to unlimted.
[     OK] Ports open for glusterd:
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1525/glusterd       

[     OK] Ports open for glusterfsd:
3:tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      2362/glusterfsd     

[  ERROR] Report failure  report=report_check_worker_restarts
[WARNING] Glusterd uptime is less than 24 hours  uptime_sec=4477
[WARNING] Errors in Glusterd log file  num_errors=108
[WARNING] Warnings in Glusterd log file  num_warning=68
[     OK] No errors seen at network card
[     OK] No errors seen at network card
0
[WARNING] Errors in Glusterd log file num_errors=170
[WARNING] Warnings in Glusterd log file num_warnings=186

....
You can find the detailed health-reportat /var/log/glusterfs/gluster-health-report-2018-07-27-11-44.log

================================================================================
[root@dhcp37-56 ~]# gluster-health-report

Loaded reports: errors_in_logs, disk_usage, gfid-mismatch-dht-report, memory_usage, firewall-check, kernel_issues, glusterd, glusterd-peer-disconnect, coredump, glusterd_volume_version_cksum_errors, glusterd-op-version, georep, errors_in_logs, ifconfig, nic-health, process_status

[     OK] Disk used percentage  path=/  percentage=15
[     OK] Disk used percentage  path=/var  percentage=15
[     OK] Disk used percentage  path=/tmp  percentage=15
[     OK] All peers are in connected state  connected_count=5  total_peer_count=5
[     OK] no gfid mismatch
[     OK] op-version is up to date  op_version=  max_op_version=
[     OK] The maximum size of core files created is set to unlimted.
[     OK] Ports open for glusterd:
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1525/glusterd       

[     OK] Ports open for glusterfsd:
3:tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      2364/glusterfsd     

[  ERROR] Report failure  report=report_check_worker_restarts
[WARNING] Glusterd uptime is less than 24 hours  uptime_sec=4342
[WARNING] Errors in Glusterd log file  num_errors=60
[WARNING] Warnings in Glusterd log file  num_warning=58
[     OK] No errors seen at network card
[     OK] No errors seen at network card
0
[WARNING] Errors in Glusterd log file num_errors=36020
[WARNING] Warnings in Glusterd log file num_warnings=183

....
You can find the detailed health-reportat /var/log/glusterfs/gluster-health-report-2018-07-27-11-45.log

Comment 2 Mohammed Rafi KC 2018-07-27 11:06:22 UTC

RCA:

Gluster shared storage applies stricter authentication when validating clients, because it is an internal volume that stores metadata for gluster features. Bricks from normal volumes therefore should not be attached to gluster shared storage bricks.

Here the bricks of volume "testvolume" were attached to the shared storage bricks, so they are subject to the same strict validation. For this reason, the volume mount fails.

The fix for this bug will be in glusterd, where the code has to be modified to select only compatible bricks for brick multiplexing.
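The idea of selecting only compatible bricks can be illustrated with a short model. This is not the glusterd patch, which is C code and is not shown in this report; it is a simplified sketch that assumes compatibility means "shared storage bricks and regular volume bricks never share a brick process". BrickProcess, is_compatible and pick_attach_target are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SHARED_STORAGE = "gluster_shared_storage"


@dataclass
class BrickProcess:
    """A running brick process and the volumes whose bricks it already serves."""
    pid: int
    volumes: List[str] = field(default_factory=list)


def is_compatible(volume: str, proc: BrickProcess) -> bool:
    """A brick may join a process only if both are, or both are not, shared storage."""
    wants_shared = volume == SHARED_STORAGE
    serves_shared = SHARED_STORAGE in proc.volumes
    return wants_shared == serves_shared


def pick_attach_target(volume: str, procs: List[BrickProcess]) -> Optional[BrickProcess]:
    """Return the first compatible process, or None if a new one must be started."""
    for proc in procs:
        if is_compatible(volume, proc):
            return proc
    return None
```

Under this rule a brick of testvolume would get its own process (or join another regular volume's process) instead of being attached to the shared storage process, so the stricter shared storage authentication would no longer apply to it.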

Comment 11 errata-xmlrpc 2018-09-04 06:51:13 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607