Description of problem:
=======================
Gluster v status times out while running in-service upgrade

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-server-3.12.2-13.el7rhgs.x86_64
glusterfs-server-3.8.4-54.14.el7rhgs.x86_64

How reproducible:
=================
Inconsistent

Steps to Reproduce:
===================
In-service upgrade was in progress:
1. Upgraded 2 nodes from 3.3.1 to 3.4.0
2. Healing was going on
3. Checked gluster v status from the node which was yet to be upgraded

[root@dhcp35-18 ~]# time gluster v status dispersed
Error : Request timed out

real    2m0.992s
user    0m0.101s
sys     0m0.091s

After some time, gluster v status worked fine without any sluggishness.

Expected results:
=================
gluster v status should not hang

Additional info:
================
Logs and sosreport uploaded at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/ubansal/vstatus/
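To quantify how long the CLI stays unresponsive before recovering, the timing above can be probed with a small polling loop. This is only a sketch: the `wait_for_cmd` helper and its 10s/10-attempt parameters are hypothetical; only the `gluster volume status dispersed` command and the ~2-minute CLI timeout come from the report.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds and report
# how long recovery took. Not part of the reported reproduction steps.
wait_for_cmd() {
    # Retry "$@" every 10s, up to 10 attempts; print elapsed seconds on success.
    start=$(date +%s)
    for attempt in 1 2 3 4 5 6 7 8 9 10; do
        if "$@" >/dev/null 2>&1; then
            echo "recovered after $(( $(date +%s) - start ))s"
            return 0
        fi
        sleep 10
    done
    echo "still failing after 10 attempts" >&2
    return 1
}

# Intended use on the not-yet-upgraded node (timeout value is an
# assumption, chosen slightly above the observed ~120s CLI timeout):
# wait_for_cmd timeout 130 gluster volume status dispersed
```

Running this right after the first timeout would show whether the hang clears on its own and roughly when, which may help correlate with the heal completing on the upgraded nodes.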
Setup Info:
===========
Had created an EC volume with the below settings and mounted it on 2 clients.

[root@dhcp35-122 yum.repos.d]# gluster v info

Volume Name: dispersed
Type: Distributed-Disperse
Volume ID: fb968754-610e-408b-8217-840038992694
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.18:/gluster/brick1/dist-dispersed
Brick2: 10.70.35.57:/gluster/brick1/dist-dispersed
Brick3: 10.70.35.131:/gluster/brick1/dist-dispersed
Brick4: 10.70.35.66:/gluster/brick1/dist-dispersed
Brick5: 10.70.35.94:/gluster/brick1/dist-dispersed
Brick6: 10.70.35.122:/gluster/brick1/dist-dispersed
Brick7: 10.70.35.18:/gluster/brick2/dist-dispersed
Brick8: 10.70.35.57:/gluster/brick2/dist-dispersed
Brick9: 10.70.35.131:/gluster/brick2/dist-dispersed
Brick10: 10.70.35.66:/gluster/brick2/dist-dispersed
Brick11: 10.70.35.94:/gluster/brick2/dist-dispersed
Brick12: 10.70.35.122:/gluster/brick2/dist-dispersed
Options Reconfigured:
disperse.shd-max-threads: 64
disperse.optimistic-change-log: off
diagnostics.client-log-level: DEBUG
disperse.eager-lock: off
transport.address-family: inet
nfs.disable: on
[root@dhcp35-122 yum.repos.d]#

From one client was running an untar of a Linux tarball, and from the other was running dd I/Os.
This doesn't look reproducible, so what's the plan here?
Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?
(In reply to Atin Mukherjee from comment #30)
> Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?

Hi Atin,
I haven't hit it in the 3.4 BU1 to 3.4 BU2 upgrade path.