Description of problem:
=======================
Had a 4-node cluster with the GA'ed 3.2 build. Created 3 volumes - ozone (2x2), disp (2x(4+2)), and dist (2x1). Had all the volumes mounted on client C1 over fuse, and some files created.

Upgraded the servers to 3.3 (3.8.4-22), enabled eventing, and enabled performance.parallel-readdir on volumes ozone and dist. The clients _continued_ to be on 3.2. The existing mount points (even after remounting) stopped working. The volume on which parallel-readdir was not enabled, disp, is still accessible via the existing mount. A fresh mount of the same volumes on 3.3 clients works.

I suppose the new volume graph created after enabling this volume option is not recognised by clients running the older gluster version.

Sosreports copied at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/

Version-Release number of selected component (if applicable):
=============================================================
3.8.4-22

How reproducible:
=================
2:2

Steps to Reproduce:
===================
1. Have a server and client on glusterfs 3.2.0, with a volume created and mounted.
2. Upgrade the server to 3.3.0. Do NOT update the client.
3. Enable performance.parallel-readdir and try to access the same volume from an existing mount point of the 3.2 client (a minimal command sketch is included at the end of this comment).

Actual results:
===============
Any command given on the mount hangs. When performance.parallel-readdir is disabled on the volume, it errors out with "Transport endpoint is not connected".

Client logs:
-----------
[2017-04-08 02:31:33.443345] I [fuse-bridge.c:4153:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2017-04-08 02:40:01.960990] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-ozone-client-1: server 10.70.47.164:49152 has not responded in the last 42 seconds, disconnecting.
[2017-04-08 02:41:48.549405] I [socket.c:3446:socket_submit_request] 0-ozone-client-1: not connected (priv->connected = -1)
[2017-04-08 02:41:48.549462] W [rpc-clnt.c:1692:rpc_clnt_submit] 0-ozone-client-1: failed to submit rpc-request (XID: 0x14 Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (ozone-client-1)
[2017-04-08 02:41:48.549481] W [MSGID: 114031] [client-rpc-fops.c:2938:client3_3_lookup_cbk] 0-ozone-client-1: remote operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]
[2017-04-08 02:41:49.569170] W [rpc-clnt.c:1692:rpc_clnt_submit] 0-ozone-client-1: failed to submit rpc-request (XID: 0x15 Program: GlusterFS 3.3, ProgVers: 330, Proc: 20) to rpc-transport (ozone-client-1)
[2017-04-08 02:41:49.569204] E [MSGID: 114031] [client-rpc-fops.c:2847:client3_3_opendir_cbk] 0-ozone-client-1: remote operation failed.
Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]

Additional info:
================
[root@dhcp47-165 ~]# gluster peer status
Number of Peers: 3

Hostname: dhcp47-164.lab.eng.blr.redhat.com
Uuid: afa697a0-2cc6-4705-892e-f5ec56a9f9de
State: Peer in Cluster (Connected)

Hostname: dhcp47-162.lab.eng.blr.redhat.com
Uuid: 95491d39-d83a-4053-b1d5-682ca7290bd2
State: Peer in Cluster (Connected)

Hostname: dhcp47-157.lab.eng.blr.redhat.com
Uuid: d0955c85-94d0-41ba-aea8-1ffde3575ea5
State: Peer in Cluster (Connected)

[root@dhcp47-165 ~]# gluster v list
disp
dist
ozone

[root@dhcp47-165 ~]# gluster v info

Volume Name: disp
Type: Distributed-Disperse
Volume ID: d7f56851-61a5-4211-8f3f-1c000a68eced
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.47.165:/bricks/brick0/disp_0
Brick2: 10.70.47.164:/bricks/brick0/disp_1
Brick3: 10.70.47.162:/bricks/brick0/disp_2
Brick4: 10.70.47.157:/bricks/brick0/disp_3
Brick5: 10.70.47.165:/bricks/brick1/disp_4
Brick6: 10.70.47.164:/bricks/brick1/disp_5
Brick7: 10.70.47.162:/bricks/brick1/disp_6
Brick8: 10.70.47.157:/bricks/brick1/disp_7
Brick9: 10.70.47.165:/bricks/brick2/disp_8
Brick10: 10.70.47.164:/bricks/brick2/disp_9
Brick11: 10.70.47.162:/bricks/brick2/disp_10
Brick12: 10.70.47.157:/bricks/brick2/disp_11
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.bitrot: on
features.scrub: Active
features.scrub-freq: hourly

Volume Name: dist
Type: Distribute
Volume ID: f9571010-d72a-4cec-a12c-f2819bf12c04
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.47.165:/bricks/brick0/dist_0
Brick2: 10.70.47.164:/bricks/brick0/dist_1
Options Reconfigured:
performance.parallel-readdir: on
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.bitrot: on
features.scrub: Active
features.scrub-freq: hourly

Volume Name: ozone
Type: Distributed-Replicate
Volume ID: 8b736150-4fdd-4f00-9446-4ae89920f63b
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.165:/bricks/brick0/ozone_0
Brick2: 10.70.47.164:/bricks/brick0/ozone_1
Brick3: 10.70.47.162:/bricks/brick0/ozone_2
Brick4: 10.70.47.157:/bricks/brick0/ozone_3
Options Reconfigured:
performance.parallel-readdir: on
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.bitrot: on
features.scrub: Active
features.scrub-freq: hourly

[root@dhcp47-165 ~]# gluster v status
Status of volume: disp
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.165:/bricks/brick0/disp_0    49154     0          Y       29903
Brick 10.70.47.164:/bricks/brick0/disp_1    49154     0          Y       23450
Brick 10.70.47.162:/bricks/brick0/disp_2    49153     0          Y       23257
Brick 10.70.47.157:/bricks/brick0/disp_3    49153     0          Y       23300
Brick 10.70.47.165:/bricks/brick1/disp_4    49155     0          Y       29922
Brick 10.70.47.164:/bricks/brick1/disp_5    49155     0          Y       23469
Brick 10.70.47.162:/bricks/brick1/disp_6    49154     0          Y       23276
Brick 10.70.47.157:/bricks/brick1/disp_7    49154     0          Y       23319
Brick 10.70.47.165:/bricks/brick2/disp_8    49156     0          Y       29942
Brick 10.70.47.164:/bricks/brick2/disp_9    49156     0          Y       23489
Brick 10.70.47.162:/bricks/brick2/disp_10   49155     0          Y       23296
Brick 10.70.47.157:/bricks/brick2/disp_11   49155     0          Y       23339
Self-heal Daemon on localhost               N/A       N/A        Y       29966
Bitrot Daemon on localhost                  N/A       N/A        Y       29977
Scrubber Daemon on localhost                N/A       N/A        Y       29989
Self-heal Daemon on dhcp47-164.lab.eng.blr.redhat.com   N/A   N/A   Y   23513
Bitrot Daemon on dhcp47-164.lab.eng.blr.redhat.com      N/A   N/A   Y   23524
Scrubber Daemon on dhcp47-164.lab.eng.blr.redhat.com    N/A   N/A   Y   23536
Self-heal Daemon on dhcp47-162.lab.eng.blr.redhat.com   N/A   N/A   Y   23318
Bitrot Daemon on dhcp47-162.lab.eng.blr.redhat.com      N/A   N/A   Y   23328
Scrubber Daemon on dhcp47-162.lab.eng.blr.redhat.com    N/A   N/A   Y   23339
Self-heal Daemon on dhcp47-157.lab.eng.blr.redhat.com   N/A   N/A   Y   23363
Bitrot Daemon on dhcp47-157.lab.eng.blr.redhat.com      N/A   N/A   Y   23373
Scrubber Daemon on dhcp47-157.lab.eng.blr.redhat.com    N/A   N/A   Y   23384

Task Status of Volume disp
------------------------------------------------------------------------------
There are no active volume tasks

Another transaction is in progress for dist. Please try again after sometime.

Status of volume: ozone
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.165:/bricks/brick0/ozone_0   49152     0          Y       28852
Brick 10.70.47.164:/bricks/brick0/ozone_1   49152     0          Y       22614
Brick 10.70.47.162:/bricks/brick0/ozone_2   49152     0          Y       22428
Brick 10.70.47.157:/bricks/brick0/ozone_3   49152     0          Y       22576
Self-heal Daemon on localhost               N/A       N/A        Y       29966
Bitrot Daemon on localhost                  N/A       N/A        Y       29977
Scrubber Daemon on localhost                N/A       N/A        Y       29989
Self-heal Daemon on dhcp47-164.lab.eng.blr.redhat.com   N/A   N/A   Y   23513
Bitrot Daemon on dhcp47-164.lab.eng.blr.redhat.com      N/A   N/A   Y   23524
Scrubber Daemon on dhcp47-164.lab.eng.blr.redhat.com    N/A   N/A   Y   23536
Self-heal Daemon on dhcp47-162.lab.eng.blr.redhat.com   N/A   N/A   Y   23318
Bitrot Daemon on dhcp47-162.lab.eng.blr.redhat.com      N/A   N/A   Y   23328
Scrubber Daemon on dhcp47-162.lab.eng.blr.redhat.com    N/A   N/A   Y   23339
Self-heal Daemon on dhcp47-157.lab.eng.blr.redhat.com   N/A   N/A   Y   23363
Bitrot Daemon on dhcp47-157.lab.eng.blr.redhat.com      N/A   N/A   Y   23373
Scrubber Daemon on dhcp47-157.lab.eng.blr.redhat.com    N/A   N/A   Y   23384

Task Status of Volume ozone
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp47-165 ~]# rpm -qa | grep gluster
glusterfs-libs-3.8.4-22.el7rhgs.x86_64
glusterfs-cli-3.8.4-22.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-22.el7rhgs.x86_64
glusterfs-rdma-3.8.4-22.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
glusterfs-3.8.4-22.el7rhgs.x86_64
glusterfs-api-3.8.4-22.el7rhgs.x86_64
glusterfs-events-3.8.4-22.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
gluster-nagios-addons-0.2.8-1.el7rhgs.x86_64
glusterfs-fuse-3.8.4-22.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-22.el7rhgs.x86_64
glusterfs-server-3.8.4-22.el7rhgs.x86_64
python-gluster-3.8.4-22.el7rhgs.noarch
[root@dhcp47-165 ~]#
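For quick reference, a minimal command sketch of the reproduction described above. The volume name and server address are taken from this setup; the mount point /mnt/ozone is illustrative only.

    # on one of the upgraded (3.8.4-22) servers
    gluster volume set ozone performance.parallel-readdir on

    # on the 3.2 client, against an existing or freshly re-created fuse mount
    umount /mnt/ozone
    mount -t glusterfs 10.70.47.165:/ozone /mnt/ozone
    ls /mnt/ozone    # hangs while the option is enabled; after disabling it,
                     # access fails with "Transport endpoint is not connected"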
[qe@rhsqe-repo 1441992]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com
[qe@rhsqe-repo 1441992]$ pwd
/home/repo/sosreports/1441992
[qe@rhsqe-repo 1441992]$ ll
total 49308
-rwxr-xr-x. 1 qe qe 12584772 Apr 13 14:33 sosreport-dhcp47-157-sysreg-prod-20170413034037.tar.xz
-rwxr-xr-x. 1 qe qe 12578400 Apr 13 14:33 sosreport-dhcp47-162-sysreg-prod-20170413034025.tar.xz
-rwxr-xr-x. 1 qe qe 12627140 Apr 13 14:34 sosreport-dhcp47-164-sysreg-prod-20170413034018.tar.xz
-rwxr-xr-x. 1 qe qe 12697080 Apr 13 14:33 sosreport-dhcp47-165-sysreg-prod-20170413034012.tar.xz
[qe@rhsqe-repo 1441992]$
RCA:

The 3.2 op-version is 31001, and the parallel-readdir op-version is 31000. The 3.2 code does not recognise the parallel-readdir feature, yet we are still able to enable it, because the 3.2 op-version is greater than parallel-readdir's op-version. Ideally we shouldn't pick features of a higher op-version in one release and then, in the next release, pick up a feature with a lower op-version.

We shouldn't allow enabling parallel-readdir until:
- the cluster op-version is that of 3.3
- all the clients and servers are upgraded to 3.3 - this is the condition that breaks here, as the parallel-readdir op-version is lower than that of 3.2.

If we allow setting parallel-readdir while there are older clients, then the older clients might crash or fail to mount, as they do not understand the new feature "parallel-readdir".

So the bug is: setting "parallel-readdir on" works even when older clients are connected, but it is expected to fail.

Possible solution:
Increase the op-version of parallel-readdir to be > 31001, downstream only.
Will wait for more discussion with the glusterd team before arriving at the solution.
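To make the comparison above concrete, here is a rough shell sketch of the gating check the RCA describes. The variable names are mine; the values are the ones quoted in the RCA. This is an illustration of the logic, not glusterd's actual code.

    cluster_op_version=31001   # RHGS 3.2
    option_op_version=31000    # performance.parallel-readdir

    # the volume-set is only rejected when the option needs an op-version
    # higher than the cluster's, so with these values the set goes through
    # even though 3.2 clients have no parallel-readdir xlator
    if [ "$option_op_version" -le "$cluster_op_version" ]; then
        echo "volume set allowed (current, broken behaviour)"
    else
        echo "volume set rejected"
    fi

Raising the option's op-version above 31001 (downstream only, as proposed) flips this comparison, so the set would be refused until the whole cluster is at the 3.3 op-version.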
(In reply to Poornima G from comment #3)
> RCA:
>
> The 3.2 op-version is 31001, and parallel readdir op-version is 31000. But
> the 3.2 code doesn't recognize parallel-readdir feature and still we will be
> able to enable the feature, because the 3.2 opversion is greater than
> parallel readdir's opversion. Ideally we should pick features of higher
> opversion in one release and then in the next release we pick a feature of
> lower opversion.
>
> We shouldn't allow enabling parallel readdir until:
> - cluster op-version is that of 3.3
> - all the clients and servers are upgraded to 3.3 - this condition is what
> is breaking as the parallel readdir opversion is lower than that of 3.2.
> If we allow setting parallel-readdir when there are older clients, then the
> older clients might crash or fail to mount as they do not understand the new
> feature "parallel-readdir".
>
> So the bug would be, setting "parallel-readdir on" is working even when
> older clients are connected, but it is expected to fail.
>
> Possible solution:
> Increase the op-version of parallel-readdir to be > 31001 only in downstream.
> Will wait for more discussion with glusterd team before arriving at the
> solution

Kaushal,

I don't have any other solution in mind apart from what Poornima is referring to. Do you think we can handle this in any other way? Unfortunately, this will again lead us to diverge the op-versions between upstream and downstream.
I cannot think of any other way. We will need to diverge op-versions again.

The original intent of syncing op-versions across upstream and downstream was to allow upstream clients to use downstream volumes. Diverging will break this, and I guess we're okay with that.

But now, we'll need someone to track these changes between upstream and downstream, and make sure they are made whenever we fork a downstream branch from upstream.
(In reply to Kaushal from comment #7)
> I cannot think of other way. We will need to diverge op-versions again.
>
> The original intent of syncing op-versions across upstream and downstream
> was to allow upstream clients to use downstream volumes. Diverging will
> break this, and I guess we're okay with that.
>
> But now, we'll need someone to track these changes between upstream and
> downstream, and make sure these changes are done whenever we fork a
> downstream branch from upstream.

Yes, that can be taken care of by the DOWNSTREAM ONLY tag in the downstream patches.
Patch posted at https://code.engineering.redhat.com/gerrit/104403
Build: 3.8.4-28

Followed the steps mentioned in the description. Unmount and remount after the server upgrade to 3.3 work fine. Pumped I/O from the mount point and tried to access the files from the client; was successfully able to access the files without any hangs.

Hence, moving this bug to Verified.
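For reference, a minimal sketch of the kind of check run here. The mount point, server address, and file names are illustrative, not taken from the actual verification run.

    # remount the volume on the 3.2 client after upgrading the servers to 3.8.4-28
    umount /mnt/ozone
    mount -t glusterfs 10.70.47.165:/ozone /mnt/ozone

    # pump some I/O and read it back; with the fix this completes without hangs
    for i in $(seq 1 50); do
        dd if=/dev/zero of=/mnt/ozone/file_$i bs=1M count=10
    done
    ls -lR /mnt/ozone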
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774