1249339 – Gluster 3.7.3 client fails to mount volume on 3.7.2 servers

Status:	CLOSED EOL
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	rpc
Version:	3.7.3
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	bugs@gluster.org

Reported:	2015-08-01 19:40 UTC by Michael DePaulo
Modified:	2017-03-08 10:49 UTC
CC List:	6 users
Last Closed:	2017-03-08 10:49:39 UTC
Regression:	---
Mount Type:	---
Documentation:	---

Attachments
gv-homes-AD.log showing me trying to mount the volume before and after each server update (129.82 KB, text/plain) 2015-08-01 19:40 UTC, Michael DePaulo

Description Michael DePaulo 2015-08-01 19:40:20 UTC

Created attachment 1058374
gv-homes-AD.log showing me trying to mount the volume before and after each server update

Description of problem:

After I upgraded my GlusterFS FUSE client from 3.5.5 to 3.7.3, I unmounted the Gluster volume (/gv-homes-AD). I then attempted to mount it multiple times, and every attempt failed.

This error was produced repeatedly in the log (/var/log/glusterfs/gv-homes-AD.log):
E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f169c7e33fa] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f169c5ae4af] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f169c5ae5ce] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f169c5afd9c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f169c5b0548] ))))) 0-gv-homes-AD-client-0: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2015-08-01 16:37:27.513395 (xid=0x3)

Both Gluster servers for the 2-brick replicated volume (I am about to make the client machine a 3rd replica) were running 3.7.2. JoeJulian on IRC identified this as likely an RPC bug.

After I updated one of the servers to 3.7.3, the client was able to mount the volume, but the errors persisted. (The 1st machine to be updated, death-star, was specified in /etc/fstab).

After I updated the 2nd server, the errors went away. So I am good to go now, but I am reporting this as a bug/issue.

Version-Release number of selected component (if applicable):
Client is F21 x86_64.
It was running 3.5.5-2.fc21.x86_64 (provided by Fedora.)
Then I updated it to 3.7.3-1.fc21 from download.gluster.org.

Both servers are RHEL 7.1 x86_64.
They were running 3.7.2-1.el7 from download.gluster.org.
Then I updated them to 3.7.3-1.el7 from download.gluster.org.
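
A client that is newer than its servers, as in the versions listed above, can be flagged before attempting a mount. The following is a minimal sketch in Python, assuming package version strings of the form X.Y.Z-release (for example 3.7.3-1.fc21). The function names are hypothetical, the sample server string is only an example input, and the code compares strings only; it does not query Gluster or predict which version pairs actually fail.

```python
import re


def parse_gluster_version(pkg_version):
    """Return (major, minor, patch) from a string such as '3.7.3-1.fc21'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", pkg_version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % pkg_version)
    return tuple(int(part) for part in match.groups())


def servers_older_than_client(client, servers):
    """Return the server version strings that are older than the client's."""
    client_version = parse_gluster_version(client)
    return [s for s in servers if parse_gluster_version(s) < client_version]


if __name__ == "__main__":
    # Example inputs only: one server still on 3.7.2, one already updated.
    older = servers_older_than_client(
        "3.7.3-1.fc21", ["3.7.2-1.el7", "3.7.3-1.el7"]
    )
    print("servers older than client:", older)
```

Comparing integer tuples rather than raw strings keeps a version such as 3.7.10 ordered after 3.7.9.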

How reproducible:
The mount command failed multiple times in a row, but I fixed the issue by updating the servers.

/etc/fstab line:
death-star:/gv-homes-AD /gv-homes-AD glusterfs acl 0 0

Comment 1 Raghavendra G 2015-08-03 06:19:18 UTC

Is there any firewall blocking the data? Can you give the output of

# iptables -L

from all the brick nodes and the client node?

Comment 2 Michael DePaulo 2015-08-03 13:25:55 UTC

Sorry, I have already updated all 3 machines.

I think I had the firewall disabled on all of them at the time. I know that I later verified that the firewall was disabled on all of them by running that command.

Comment 3 Bob Gomez 2015-08-03 14:48:50 UTC

We are seeing the same issue. A 3.7.3 client fails to mount a volume from a 3.7.2 server. I didn't see any log messages on the server side.

Log From Client:
[2015-08-03 14:36:55.678451] I [MSGID: 100030] [glusterfsd.c:2301:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.3 (args: /usr/sbin/glusterfs --volfile-server=gluster1 --volfile-server=gluster2 --volfile-id=general1 /gluster/general1)
[2015-08-03 14:36:55.689039] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-08-03 14:36:55.726060] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 10.x.x.x:24007 failed (No data available)
[2015-08-03 14:36:55.726894] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f084bb7bf46] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f084b94c54e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f084b94c65e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f084b94df1c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f084b94e6b8] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2015-08-03 14:36:55.707046 (xid=0x1)
[2015-08-03 14:36:55.726988] E [glusterfsd-mgmt.c:1604:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:general1)
[2015-08-03 14:36:55.727131] W [glusterfsd.c:1219:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x205) [0x7f084b94c575] -->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x400) [0x7f084c051f00] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f084c04c589] ) 0-: received signum (0), shutting down
[2015-08-03 14:36:55.727252] I [fuse-bridge.c:5595:fini] 0-fuse: Unmounting '/gluster/general1'.
[2015-08-03 14:36:55.740108] W [glusterfsd.c:1219:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f084b30f182] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7f084c04c6f5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f084c04c589] ) 0-: received signum (15), shutting down
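
The "forced unwinding frame" errors in client logs like the one above name the RPC program and operation that was abandoned (GF-DUMP/DUMP in the original description, GlusterFS Handshake/GETSPEC here). The following is a minimal sketch in Python for pulling those fields out of a log, assuming the line layout matches the excerpts in this report; the regular expression and function name are hypothetical and not part of any Gluster tooling.

```python
import re

# Matches the tail of a saved_frames_unwind error line, for example:
#   forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at
#   2015-08-01 16:37:27.513395 (xid=0x3)
UNWIND_RE = re.compile(
    r"forced unwinding frame type\((?P<program>[^)]*)\) "
    r"op\((?P<op>\w+)\((?P<opnum>\d+)\)\) "
    r"called at (?P<called_at>\S+ \S+) "
    r"\(xid=(?P<xid>0x[0-9a-fA-F]+)\)"
)


def find_unwound_frames(lines):
    """Return (program, op, op number, xid) for each unwound frame found."""
    frames = []
    for line in lines:
        match = UNWIND_RE.search(line)
        if match:
            frames.append((
                match.group("program"),
                match.group("op"),
                int(match.group("opnum")),
                int(match.group("xid"), 16),
            ))
    return frames


if __name__ == "__main__":
    sample = [
        "0-gv-homes-AD-client-0: forced unwinding frame type(GF-DUMP) "
        "op(DUMP(1)) called at 2015-08-01 16:37:27.513395 (xid=0x3)",
        "0-glusterfs: forced unwinding frame type(GlusterFS Handshake) "
        "op(GETSPEC(2)) called at 2015-08-03 14:36:55.707046 (xid=0x1)",
    ]
    for frame in find_unwound_frames(sample):
        print(frame)
```

To scan a real log, pass an open file object, for example `find_unwound_frames(open("/var/log/glusterfs/gv-homes-AD.log"))`.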

Comment 4 Bob Gomez 2015-08-03 14:49:56 UTC

Downgrading the client to 3.7.2 fixes the issue.

Comment 5 Brian Sipos 2016-08-08 21:49:58 UTC

I just ran into this same issue with oVirt 4.0 on CentOS 7 when trying to connect a new host to an existing Gluster storage domain.

If I manually mount with the client at 3.7.3, I see the "failed to fetch volume file" error. After manually downgrading the client to 3.7.2 (as suggested in comment 4), the issue no longer occurs.

This is now a one-year-old issue that affects new hosts using the current, just-released version of oVirt.

Comment 6 Kaushal 2017-03-08 10:49:39 UTC

This bug is being closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
