Bug 1249339 - Gluster 3.7.3 client fails to mount volume on 3.7.2 servers
Product: GlusterFS
Classification: Community
Component: rpc
Platform: x86_64 Linux
Severity: high
Assigned To: bugs@gluster.org
Keywords: Triaged
Reported: 2015-08-01 15:40 EDT by Michael DePaulo
Modified: 2017-03-08 05:49 EST
CC: 6 users

Doc Type: Bug Fix
Last Closed: 2017-03-08 05:49:39 EST
Type: Bug

Attachments
gv-homes-AD.log showing me trying to mount the volume before and after each server update (129.82 KB, text/plain)
2015-08-01 15:40 EDT, Michael DePaulo
Description Michael DePaulo 2015-08-01 15:40:20 EDT
Created attachment 1058374 [details]
gv-homes-AD.log showing me trying to mount the volume before and after each server update

Description of problem:

After I upgraded my GlusterFS FUSE client from 3.5.5 to 3.7.3, I unmounted the gluster volume (/gv-homes-AD). I then attempted to remount it several times, but every attempt failed.

This error was produced repeatedly in the log (/var/log/glusterfs/gv-homes-AD.log):
E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f169c7e33fa] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f169c5ae4af] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f169c5ae5ce] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f169c5afd9c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f169c5b0548] ))))) 0-gv-homes-AD-client-0: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2015-08-01 16:37:27.513395 (xid=0x3)

Both Gluster servers for the 2-brick replicated volume (I am about to make the client machine a 3rd replica) were running 3.7.2. JoeJulian on IRC identified this as likely an RPC bug.

After I updated one of the servers to 3.7.3, the client was able to mount the volume, but the errors persisted. (The 1st machine to be updated, death-star, was specified in /etc/fstab).

After I updated the 2nd server, the errors went away. So I am good to go now, but I am reporting this as a bug/issue.

Version-Release number of selected component (if applicable):
Client is F21 x86_64.
It was running 3.5.5-2.fc21.x86_64 (provided by Fedora).
Then I updated it to 3.7.3-1.fc21 from download.gluster.org.

Both servers are RHEL 7.1 x86_64.
They were running 3.7.2-1.el7 from download.gluster.org.
Then I updated them to 3.7.3-1.el7 from download.gluster.org.

How reproducible:
The mount command failed multiple times in a row, but I fixed the issue by updating the servers.

/etc/fstab line:
death-star:/gv-homes-AD /gv-homes-AD glusterfs acl 0 0
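As a minimal sketch (not part of the original report), the fstab fields above map onto a manual mount invocation like this; the hostname, mountpoint, and options are taken directly from the entry itself:

```shell
# Parse the fstab entry into its six standard fields:
# source, mountpoint, fstype, options, dump, pass
entry="death-star:/gv-homes-AD /gv-homes-AD glusterfs acl 0 0"
read -r src mnt fstype opts dump pass <<<"$entry"

# Print the equivalent manual mount command
# (actually running it requires root and a reachable server):
echo "mount -t $fstype -o $opts $src $mnt"
```

Running the printed command by hand is a quick way to reproduce the failure outside of fstab processing.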
Comment 1 Raghavendra G 2015-08-03 02:19:18 EDT
Is there any firewall blocking the data? Can you give the output of,

# iptables -L

from all the bricks and client node?
Comment 2 Michael DePaulo 2015-08-03 09:25:55 EDT
Sorry, I have already updated all 3 machines.

I think I had the firewall disabled on all of them at the time. I know that I later verified that the firewall was disabled on all of them by running that command.
Comment 3 Bob Gomez 2015-08-03 10:48:50 EDT
We are seeing the same issue. A 3.7.3 client fails to mount from a 3.7.2 server. I didn't see any log messages on the server side.

Log From Client:
[2015-08-03 14:36:55.678451] I [MSGID: 100030] [glusterfsd.c:2301:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.3 (args: /usr/sbin/glusterfs --volfile-server=gluster1 --volfile-server=gluster2 --volfile-id=general1 /gluster/general1)
[2015-08-03 14:36:55.689039] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-08-03 14:36:55.726060] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 10.x.x.x:24007 failed (No data available)
[2015-08-03 14:36:55.726894] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f084bb7bf46] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f084b94c54e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f084b94c65e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f084b94df1c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f084b94e6b8] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2015-08-03 14:36:55.707046 (xid=0x1)
[2015-08-03 14:36:55.726988] E [glusterfsd-mgmt.c:1604:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:general1)
[2015-08-03 14:36:55.727131] W [glusterfsd.c:1219:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x205) [0x7f084b94c575] -->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x400) [0x7f084c051f00] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f084c04c589] ) 0-: received signum (0), shutting down
[2015-08-03 14:36:55.727252] I [fuse-bridge.c:5595:fini] 0-fuse: Unmounting '/gluster/general1'.
[2015-08-03 14:36:55.740108] W [glusterfsd.c:1219:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f084b30f182] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7f084c04c6f5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f084c04c589] ) 0-: received signum (15), shutting down
Comment 4 Bob Gomez 2015-08-03 10:49:56 EDT
Downgrading the client to 3.7.2 fixes the issue.
Comment 5 Brian Sipos 2016-08-08 17:49:58 EDT
I just ran into this same issue with oVirt 4.0 on CentOS 7 when trying to connect a new host into an existing Gluster storage domain.

If I manually mount with the client at 3.7.3, I see the "failed to fetch volume file" error. I was able to manually downgrade the client to 3.7.2 (as suggested in comment #4), and the issue is not seen.

This is now a one-year-old issue which affects new hosts using the current, just-released version of oVirt.
Comment 6 Kaushal 2017-03-08 05:49:39 EST
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
