Description of problem:
=======================
Had a 6-node cluster on the glusterfs-6.0-1 build with an n*3 volume 'testvol' created. Op-version was set to 60000. Mounted it on a client running glusterfs-3.12.2-47 bits, and the mount failed with the error --> "glusterfs: failed to get the 'volume file' from server....Server is operating at an op-version which is not supported."

Updated the client bits to glusterfs-6.0-1 and the mount was successful.

Version-Release number of selected component (if applicable):
============================================================
# rpm -qa | grep gluster
glusterfs-cli-6.0-1.el7rhgs.x86_64
glusterfs-cloudsync-plugins-6.0-1.el7rhgs.x86_64
tmp-rhs-tests-beaker-rhs-gluster-qe-libs-dev-bturner-3.0-0.noarch
python2-gluster-6.0-1.el7rhgs.x86_64
glusterfs-geo-replication-6.0-1.el7rhgs.x86_64
glusterfs-6.0-1.el7rhgs.x86_64
glusterfs-api-6.0-1.el7rhgs.x86_64
glusterfs-devel-6.0-1.el7rhgs.x86_64
glusterfs-client-xlators-6.0-1.el7rhgs.x86_64
glusterfs-fuse-6.0-1.el7rhgs.x86_64
glusterfs-events-6.0-1.el7rhgs.x86_64
glusterfs-rdma-6.0-1.el7rhgs.x86_64
glusterfs-thin-arbiter-6.0-1.el7rhgs.x86_64
glusterfs-libs-6.0-1.el7rhgs.x86_64
glusterfs-server-6.0-1.el7rhgs.x86_64
glusterfs-debuginfo-6.0-1.el7rhgs.x86_64
#

How reproducible:
=================
Always

Steps to Reproduce:
===================
1. Have an n-node (n > 1) cluster with an RHGS 3.5 interim build.
2. Try to mount a volume from it on the current live RHGS 3.4 (BU4) client - glusterfs-3.12.2-47.

Actual results:
===============
Mount fails.

# mount -t glusterfs gqas001.sbu.lab.eng.bos.redhat.com:testvol /mnt/tmp/
Mount failed. Please check the log file for more details.

Expected results:
=================
n-1 (and n-2) client compatibility should not break for RHGS 3.5.

Additional info:
================
Not attaching sosreports, as these are perf machines and the reports would be on the heavier side for a relatively straightforward issue. Please do let me know if they are required, and I'll work on uploading them.

Client logs:
------------
[root@dhcp46-85 ~]# rpm -qa | grep gluster
glusterfs-libs-3.12.2-47.el7.x86_64
glusterfs-client-xlators-3.12.2-47.el7.x86_64
glusterfs-3.12.2-47.el7.x86_64
glusterfs-fuse-3.12.2-47.el7.x86_64
[root@dhcp46-85 ~]#
[root@dhcp46-85 yum.repos.d]# yum repolist
Loaded plugins: product-id, search-disabled-repos, subscription-manager
rh-gluster-3-client-for-rhel-7-server-rpms                                    | 4.0 kB  00:00:00
rhel-7-server-rpms                                                            | 3.4 kB  00:00:00
(1/6): rh-gluster-3-client-for-rhel-7-server-rpms/7Server/x86_64/group        |  124 B  00:00:01
(2/6): rh-gluster-3-client-for-rhel-7-server-rpms/7Server/x86_64/updateinfo   |  87 kB  00:00:01
(3/6): rh-gluster-3-client-for-rhel-7-server-rpms/7Server/x86_64/primary_db   | 120 kB  00:00:01
(4/6): rhel-7-server-rpms/7Server/x86_64/group                                | 774 kB  00:00:02
(5/6): rhel-7-server-rpms/7Server/x86_64/updateinfo                           | 3.0 MB  00:00:02
(6/6): rhel-7-server-rpms/7Server/x86_64/primary_db                           |  54 MB  00:00:07
repo id                                                     repo name                                          status
rh-gluster-3-client-for-rhel-7-server-rpms/7Server/x86_64   Red Hat Storage Native Client for RHEL 7 (RPMs)       252
rhel-7-server-rpms/7Server/x86_64                           Red Hat Enterprise Linux 7 Server (RPMs)           23,926
repolist: 24,178
[root@dhcp46-85 yum.repos.d]#
[root@dhcp46-85 yum.repos.d]# cd
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# mkdir /mnt/tmp
[root@dhcp46-85 ~]# ping gqas001.sbu.lab.eng.bos.redhat.com
PING gqas001.sbu.lab.eng.bos.redhat.com (10.16.156.0) 56(84) bytes of data.
64 bytes from gqas001.sbu.lab.eng.bos.redhat.com (10.16.156.0): icmp_seq=1 ttl=55 time=245 ms
64 bytes from gqas001.sbu.lab.eng.bos.redhat.com (10.16.156.0): icmp_seq=2 ttl=55 time=245 ms
^C
--- gqas001.sbu.lab.eng.bos.redhat.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 245.895/245.933/245.972/0.497 ms
[root@dhcp46-85 ~]# mount -t glusterfs gqas001.sbu.lab.eng.bos.redhat.com:testvol /mnt/tmp/
Mount failed. Please check the log file for more details.
[root@dhcp46-85 ~]# vim /var/log/glusterfs/mnt-tmp.log
-bash: vim: command not found
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# cat /var/log/glusterfs/mnt-tmp.log
[2019-04-09 04:18:40.387646] I [MSGID: 100030] [glusterfsd.c:2646:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.2 (args: /usr/sbin/glusterfs --volfile-server=gqas001.sbu.lab.eng.bos.redhat.com --volfile-id=testvol /mnt/tmp)
[2019-04-09 04:18:40.461797] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2019-04-09 04:18:40.478763] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-04-09 04:18:40.478852] I [MSGID: 101190] [event-epoll.c:676:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-04-09 04:18:40.971968] E [glusterfsd-mgmt.c:1925:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2019-04-09 04:18:40.972037] E [glusterfsd-mgmt.c:2051:mgmt_getspec_cbk] 0-mgmt: Server is operating at an op-version which is not supported
[2019-04-09 04:18:40.974171] W [glusterfsd.c:1462:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90) [0x7f85a20c6a00] -->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x485) [0x563209920a55] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x563209919b2b] ) 0-: received signum (0), shutting down
[2019-04-09 04:18:40.974297] I [fuse-bridge.c:6611:fini] 0-fuse: Unmounting '/mnt/tmp'.
[2019-04-09 04:18:40.979281] I [fuse-bridge.c:6616:fini] 0-fuse: Closing fuse connection to '/mnt/tmp'.
[2019-04-09 04:18:40.980417] W [glusterfsd.c:1462:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f85a115cdd5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x563209919cc5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x563209919b2b] ) 0-: received signum (15), shutting down
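The two E-level lines above are the client-side half of the failure: the volfile (GETSPEC) callback gets an error back from glusterd instead of a volfile, maps the "not supported" errno to the op-version message, and tears the mount down instead of bringing it up degraded. A minimal, purely illustrative C sketch of that decision (the structure and function names here are hypothetical stand-ins, not the actual mgmt_getspec_cbk code):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the reply the client gets for its volfile request. */
struct getspec_reply {
    int op_ret;            /* < 0 on failure */
    int op_errno;          /* e.g. ENOTSUP when op-versions don't match */
    const char *spec;      /* volfile text on success */
};

static void handle_getspec_reply(const struct getspec_reply *rsp)
{
    if (rsp->op_ret < 0) {
        fprintf(stderr, "failed to get the 'volume file' from server\n");
        if (rsp->op_errno == ENOTSUP)
            fprintf(stderr, "Server is operating at an op-version "
                            "which is not supported\n");
        /* The real client calls cleanup_and_exit() at this point, which is
         * why the mount fails outright rather than coming up degraded. */
        exit(EXIT_FAILURE);
    }
    /* Success path: parse the volfile text and build the client graph. */
    printf("got volfile (%zu bytes)\n", strlen(rsp->spec));
}

int main(void)
{
    struct getspec_reply rejected = { .op_ret = -1, .op_errno = ENOTSUP, .spec = NULL };
    handle_getspec_reply(&rejected);   /* prints the two errors and exits */
    return 0;
}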
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# cat /etc/yum.repos.d/glusterfs-6.repo
[local]
name=glusterfs-6
baseurl=file:///home/glusterfs-6
enabled=1
gpgcheck=0
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# yum update glusterfs glusterfs-fuse
Loaded plugins: product-id, search-disabled-repos, subscription-manager
local                                                                         | 2.9 kB  00:00:00
local/primary_db                                                              | 7.3 kB  00:00:00
Resolving Dependencies
--> Running transaction check
---> Package glusterfs.x86_64 0:3.12.2-47.el7 will be updated
---> Package glusterfs.x86_64 0:6.0-1.el7 will be an update
--> Processing Dependency: glusterfs-libs(x86-64) = 6.0-1.el7 for package: glusterfs-6.0-1.el7.x86_64
---> Package glusterfs-fuse.x86_64 0:3.12.2-47.el7 will be updated
---> Package glusterfs-fuse.x86_64 0:6.0-1.el7 will be an update
--> Processing Dependency: glusterfs-client-xlators(x86-64) = 6.0-1.el7 for package: glusterfs-fuse-6.0-1.el7.x86_64
--> Running transaction check
---> Package glusterfs-client-xlators.x86_64 0:3.12.2-47.el7 will be updated
---> Package glusterfs-client-xlators.x86_64 0:6.0-1.el7 will be an update
---> Package glusterfs-libs.x86_64 0:3.12.2-47.el7 will be updated
---> Package glusterfs-libs.x86_64 0:6.0-1.el7 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                      Arch       Version       Repository       Size
================================================================================
Updating:
 glusterfs                    x86_64     6.0-1.el7     local            599 k
 glusterfs-fuse               x86_64     6.0-1.el7     local            122 k
Updating for dependencies:
 glusterfs-client-xlators     x86_64     6.0-1.el7     local            825 k
 glusterfs-libs               x86_64     6.0-1.el7     local            390 k

Transaction Summary
================================================================================
Upgrade  2 Packages (+2 Dependent packages)

Total download size: 1.9 M
Is this ok [y/d/N]: y
Downloading packages:
...
...
...
Complete!
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# rpm -qa | grep gluster
glusterfs-6.0-1.el7.x86_64
glusterfs-libs-6.0-1.el7.x86_64
glusterfs-client-xlators-6.0-1.el7.x86_64
glusterfs-fuse-6.0-1.el7.x86_64
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# mkdir /mnt/tmpNew
[root@dhcp46-85 ~]# mount -t glusterfs gqas001.sbu.lab.eng.bos.redhat.com:testvol /mnt/tmpNew/
[root@dhcp46-85 ~]#
[root@dhcp46-85 ~]# vi /var/log/glusterfs/mnt-tmpNew.log
[root@dhcp46-85 ~]# mount | grep gluster
gqas001.sbu.lab.eng.bos.redhat.com:testvol on /mnt/tmpNew type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@dhcp46-85 ~]#

Server logs:
------------
[root@gqas001 ~]# rpm -qa | grep gluster
glusterfs-cli-6.0-1.el7rhgs.x86_64
glusterfs-cloudsync-plugins-6.0-1.el7rhgs.x86_64
tmp-rhs-tests-beaker-rhs-gluster-qe-libs-dev-bturner-3.0-0.noarch
python2-gluster-6.0-1.el7rhgs.x86_64
glusterfs-geo-replication-6.0-1.el7rhgs.x86_64
glusterfs-6.0-1.el7rhgs.x86_64
glusterfs-api-6.0-1.el7rhgs.x86_64
glusterfs-devel-6.0-1.el7rhgs.x86_64
glusterfs-client-xlators-6.0-1.el7rhgs.x86_64
glusterfs-fuse-6.0-1.el7rhgs.x86_64
glusterfs-events-6.0-1.el7rhgs.x86_64
glusterfs-rdma-6.0-1.el7rhgs.x86_64
glusterfs-thin-arbiter-6.0-1.el7rhgs.x86_64
glusterfs-libs-6.0-1.el7rhgs.x86_64
glusterfs-server-6.0-1.el7rhgs.x86_64
glusterfs-debuginfo-6.0-1.el7rhgs.x86_64
[root@gqas001 ~]#
[root@gqas001 ~]# gluster pool list
UUID                                    Hostname                                State
825da299-4e10-4a93-9f26-c30b6c49f1c9    gqas004.sbu.lab.eng.bos.redhat.com      Connected
2466fcd1-78f0-4d66-bd18-28fed503e504    gqas009.sbu.lab.eng.bos.redhat.com      Connected
528c8eae-9a54-4394-bcab-566495cc5a68    gqas010.sbu.lab.eng.bos.redhat.com      Connected
e2a31da0-e9f0-479b-89ec-6dc4e316d299    gqas012.sbu.lab.eng.bos.redhat.com      Connected
a9380115-b30b-475e-a22d-e69bee8a92d9    gqas014.sbu.lab.eng.bos.redhat.com      Connected
4b69a70d-b81c-4aff-aa62-643b7a62b135    localhost                               Connected
[root@gqas001 ~]#
[root@gqas001 ~]# gluster v get all all
Option                                  Value
------                                  -----
cluster.server-quorum-ratio             51
cluster.enable-shared-storage           disable
cluster.op-version                      60000
cluster.max-op-version                  60000
cluster.brick-multiplex                 disable
cluster.max-bricks-per-process          250
cluster.daemon-log-level                INFO
[root@gqas001 ~]#
[root@gqas001 ~]# gluster v info

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 02e345d4-7567-4bdb-83c0-698cb70f275d
Status: Started
Snapshot Count: 0
Number of Bricks: 24 x 3 = 72
Transport-type: tcp
Bricks:
Brick1: gqas001.sbu.lab.eng.bos.redhat.com:/gluster/brick1/testvol
Brick2: gqas004.sbu.lab.eng.bos.redhat.com:/gluster/brick1/testvol
...
...
...
Brick70: gqas010.sbu.lab.eng.bos.redhat.com:/gluster/brick12/testvol
Brick71: gqas012.sbu.lab.eng.bos.redhat.com:/gluster/brick12/testvol
Brick72: gqas014.sbu.lab.eng.bos.redhat.com:/gluster/brick12/testvol
Options Reconfigured:
performance.cache-samba-metadata: on
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off
[root@gqas001 ~]#
[root@gqas001 ~]# cat /var/log/glusterfs/glusterd.log
...
...
[2019-04-09 03:59:15.632766] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2019-04-09 03:59:15.635367] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2019-04-09 04:18:40.848223] I [MSGID: 106022] [glusterd-handshake.c:868:_client_supports_volume] 0-glusterd: Client 10.70.46.85:1023 (1 -> 31305) doesn't support required op-version (40000). Rejecting volfile request. [Operation not supported]
[2019-04-09 07:30:50.013756] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2019-04-09 07:30:50.016410] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2019-04-09 07:31:36.587161] I [MSGID: 106487] [glusterd-handler.c:1498:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2019-04-09 07:32:00.278177] E [MSGID: 106061] [glusterd-utils.c:10290:glusterd_max_opversion_use_rsp_dict] 0-management: Maximum supported op-version not set in destination dictionary
...
...
[root@gqas001 ~]#
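The MSGID 106022 line above is the server-side check that actually fails the mount: when the client asks for the volfile it advertises the op-version range it can speak (here 1 -> 31305, i.e. a 3.12.2 client), and glusterd refuses to hand out the volfile if that range does not cover the op-version demanded by the volume's configured options (40000 here, i.e. a release-4 feature). A rough sketch of that gate - the names and constants below are illustrative only, not the actual _client_supports_volume() code:

#include <stdio.h>

/* Illustrative constants; 31305 and 40000 are taken from the log line above. */
#define CLIENT_MIN_OP_VERSION      1
#define CLIENT_MAX_OP_VERSION      31305   /* what the 3.12.2-47 client advertises */
#define VOLUME_REQUIRED_OP_VERSION 40000   /* demanded by an option introduced in release-4 */

/* The volfile is only served if the client's supported range covers the
 * op-version required by the volume's options. */
static int client_supports_volume(int cmin, int cmax, int required)
{
    return cmin <= required && required <= cmax;
}

int main(void)
{
    if (!client_supports_volume(CLIENT_MIN_OP_VERSION, CLIENT_MAX_OP_VERSION,
                                VOLUME_REQUIRED_OP_VERSION))
        printf("Client (%d -> %d) doesn't support required op-version (%d). "
               "Rejecting volfile request.\n",
               CLIENT_MIN_OP_VERSION, CLIENT_MAX_OP_VERSION,
               VOLUME_REQUIRED_OP_VERSION);
    return 0;
}

Note that the cluster op-version itself is 60000; what locks the old client out is the per-volume requirement derived from the options set on testvol, which the comment below traces back to performance.cache-invalidation and ctime.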
There are two parts to this problem. What you have hit is explained in (1), but if we fix only (1) you would end up hitting (2), so we need to fix both. The root cause is self-explanatory from the commit messages, but I'll paste them here too.

1. Upstream patch: https://review.gluster.org/#/c/22539/ (we still need to reach agreement on this approach)

With the group-metadata-cache group profile, turning on the performance.cache-invalidation option enables the cache-invalidation feature of both the md-cache and quick-read xlators. While the intent of group-metadata-cache is to set md-cache's cache-invalidation feature, quick-read is affected as well because both features share the same key. The md-cache feature and its profile have existed since release-3.9, but quick-read cache-invalidation was only introduced in release-4; because of this op-version mismatch, applying the group profile on any cluster at op-version >= glusterfs-4 breaks backward compatibility with old clients (see the sketch at the end of this comment).

The proposed fix is to rename the quick-read key to 'quick-read-cache-invalidation' so that the two features have distinct identities. This brings its own backward-compatibility challenge: if the feature is enabled on an existing cluster that is then upgraded to a version carrying this change, the old key becomes unidentified. As a workaround we can ask users upgrading to release-7 to turn the option off, upgrade the cluster, and turn it back on with the new key. This needs to be documented once the patch is accepted.

2. Upstream patch: https://review.gluster.org/22536

Since ctime is a client-side feature, we can't blindly load the ctime xlator into the client graph when the feature is explicitly turned off; doing so results in a backward-compatibility issue where an old client cannot mount a volume configured on a server that has the ctime feature.
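To make the shared-key problem in (1) concrete: each volume option carries the op-version in which it was introduced, and the op-version a volume demands from clients is the maximum over its enabled options. Because performance.cache-invalidation was wired to both the md-cache feature (release-3.9) and the quick-read feature (release-4), turning it on pushed the volume's requirement to 40000 and locked out 3.12 clients; giving quick-read its own key keeps the old option at the old op-version. A hedged sketch of that bookkeeping (the table and names below are illustrative, not glusterd's actual option table):

#include <stdio.h>
#include <string.h>

/* Illustrative option table: each key carries the op-version that introduced it. */
struct vol_option {
    const char *key;
    int op_version;
};

static const struct vol_option option_table[] = {
    /* md-cache cache-invalidation: available since release-3.9 */
    { "performance.cache-invalidation", 30900 },
    /* quick-read cache-invalidation: release-4 feature, renamed per patch (1)
     * so it no longer rides on the md-cache key */
    { "performance.quick-read-cache-invalidation", 40000 },
};

/* A volume's required client op-version is the maximum over its enabled options. */
static int required_client_op_version(const char **enabled, size_t n)
{
    int req = 1;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < sizeof(option_table) / sizeof(option_table[0]); j++)
            if (strcmp(enabled[i], option_table[j].key) == 0 &&
                option_table[j].op_version > req)
                req = option_table[j].op_version;
    return req;
}

int main(void)
{
    /* With distinct keys, the group-metadata-cache profile only enables the
     * md-cache key, so a 3.12 client (max op-version 31305) still qualifies. */
    const char *enabled[] = { "performance.cache-invalidation" };
    printf("required client op-version: %d\n",
           required_client_op_version(enabled, 1));
    return 0;
}

Patch (2) addresses the same class of problem from the other direction: an xlator should not be placed in the client graph at all when its feature is turned off, so the generated volfile does not demand a newer client than the volume's configuration actually requires.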
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249