Bug 1598746

Summary: sporadic timeout of 'gluster v status' command on EC volume during in-service upgrade
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Upasana <ubansal>
Component: glusterd
Assignee: Sanju <srakonde>
Status: CLOSED WORKSFORME
QA Contact: Bala Konda Reddy M <bmekala>
Severity: medium
Docs Contact:
Priority: medium
Version: rhgs-3.4
CC: amukherj, nchilaka, rhs-bugs, sankarshan, sheggodu, srakonde, storage-qa-internal, ubansal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-12-04 16:39:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1558948

Description Upasana 2018-07-06 11:27:18 UTC
Description of problem:
======================
'gluster v status' times out while an in-service upgrade is running


Version-Release number of selected component (if applicable):
============================================================
glusterfs-server-3.12.2-13.el7rhgs.x86_64
glusterfs-server-3.8.4-54.14.el7rhgs.x86_64


How reproducible:
================
Inconsistent 

Steps to Reproduce:
==================

An in-service upgrade of the cluster was in progress.
1. Upgraded 2 nodes from 3.3.1 to 3.4.0
2. Healing was in progress on the EC volume
3. Checked 'gluster v status' from a node that was yet to be upgraded (see the polling sketch below the output)
[root@dhcp35-18 ~]# time gluster v status dispersed
Error : Request timed out

real	2m0.992s
user	0m0.101s
sys	0m0.091s

Some time later, 'gluster v status' worked fine without any sluggishness.
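
Note that the ~2m elapsed time matches the gluster CLI's default 120-second request timeout. A minimal sketch of how the check was repeated during the upgrade window, assuming the volume name 'dispersed' and a not-yet-upgraded node; the loop interval and iteration count are illustrative, not from the original report:

#!/bin/bash
# Hypothetical polling loop run on a not-yet-upgraded node while the other
# nodes are being upgraded; volume name 'dispersed' taken from this report.
VOL=dispersed

for i in $(seq 1 30); do
    echo "=== iteration $i: $(date) ==="

    # Per-brick pending-heal counts; non-zero values mean healing is still
    # in progress (step 2 above).
    gluster volume heal "$VOL" info | grep 'Number of entries'

    # Time the status call; during the failure window this returned
    # "Error : Request timed out" after ~2 minutes.
    time gluster volume status "$VOL"

    sleep 60
done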
Actual results:
===============

'gluster v status' timed out with "Error : Request timed out" while healing was in progress during the in-service upgrade.


Expected results:
=================

'gluster v status' should not time out.


Additional info:
Logs and sosreports have been uploaded to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/ubansal/vstatus/

Comment 2 Upasana 2018-07-06 11:37:02 UTC
Setup Info -
Created an EC volume with the settings below and mounted it on 2 clients:
[root@dhcp35-122 yum.repos.d]# gluster v info
 
Volume Name: dispersed
Type: Distributed-Disperse
Volume ID: fb968754-610e-408b-8217-840038992694
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.18:/gluster/brick1/dist-dispersed
Brick2: 10.70.35.57:/gluster/brick1/dist-dispersed
Brick3: 10.70.35.131:/gluster/brick1/dist-dispersed
Brick4: 10.70.35.66:/gluster/brick1/dist-dispersed
Brick5: 10.70.35.94:/gluster/brick1/dist-dispersed
Brick6: 10.70.35.122:/gluster/brick1/dist-dispersed
Brick7: 10.70.35.18:/gluster/brick2/dist-dispersed
Brick8: 10.70.35.57:/gluster/brick2/dist-dispersed
Brick9: 10.70.35.131:/gluster/brick2/dist-dispersed
Brick10: 10.70.35.66:/gluster/brick2/dist-dispersed
Brick11: 10.70.35.94:/gluster/brick2/dist-dispersed
Brick12: 10.70.35.122:/gluster/brick2/dist-dispersed
Options Reconfigured:
disperse.shd-max-threads: 64
disperse.optimistic-change-log: off
diagnostics.client-log-level: DEBUG
disperse.eager-lock: off
transport.address-family: inet
nfs.disable: on
[root@dhcp35-122 yum.repos.d]# 
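
For reference, a sketch of the commands that would produce the 2 x (4 + 2) distributed-disperse volume and options shown above. The exact create/mount commands were not captured in this report, so this is a reconstruction: brick paths and hostnames are taken from the 'gluster v info' output, and the mount point is illustrative.

# Reconstruction of the volume setup (not copied from the original report).
gluster volume create dispersed disperse-data 4 redundancy 2 \
    10.70.35.18:/gluster/brick1/dist-dispersed \
    10.70.35.57:/gluster/brick1/dist-dispersed \
    10.70.35.131:/gluster/brick1/dist-dispersed \
    10.70.35.66:/gluster/brick1/dist-dispersed \
    10.70.35.94:/gluster/brick1/dist-dispersed \
    10.70.35.122:/gluster/brick1/dist-dispersed \
    10.70.35.18:/gluster/brick2/dist-dispersed \
    10.70.35.57:/gluster/brick2/dist-dispersed \
    10.70.35.131:/gluster/brick2/dist-dispersed \
    10.70.35.66:/gluster/brick2/dist-dispersed \
    10.70.35.94:/gluster/brick2/dist-dispersed \
    10.70.35.122:/gluster/brick2/dist-dispersed

# Options reconfigured on the volume, as listed in 'gluster v info' above.
gluster volume set dispersed disperse.shd-max-threads 64
gluster volume set dispersed disperse.optimistic-change-log off
gluster volume set dispersed diagnostics.client-log-level DEBUG
gluster volume set dispersed disperse.eager-lock off

gluster volume start dispersed

# Mount on each of the two clients (mount point is illustrative).
mount -t glusterfs 10.70.35.18:/dispersed /mnt/dispersed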

From one client an untar of a Linux kernel tarball was running, and from the other dd I/O was running (see the sketch below).
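
A minimal sketch of that client workload, assuming the volume is mounted at /mnt/dispersed on both clients; the tarball name, file size, and block size are illustrative, not from the report:

# Client 1: untar a Linux kernel source tarball onto the gluster mount
# (tarball name is illustrative).
mkdir -p /mnt/dispersed/untar-test
tar -xf linux-4.18.tar.xz -C /mnt/dispersed/untar-test

# Client 2: sequential dd writes onto the same volume
# (file size and block size are illustrative).
mkdir -p /mnt/dispersed/dd-test
dd if=/dev/zero of=/mnt/dispersed/dd-test/file1 bs=1M count=10240 conv=fdatasync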

Comment 23 Atin Mukherjee 2018-10-12 06:26:16 UTC
This doesn't look reproducible, so what's the plan here?

Comment 30 Atin Mukherjee 2018-12-02 07:50:16 UTC
Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?

Comment 31 Upasana 2018-12-03 05:21:42 UTC
(In reply to Atin Mukherjee from comment #30)
> Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?

Hi Atin, 

I haven't hit it in the 3.4 BU1 to 3.4 BU2 upgrade path.