Bug 1598746

Summary: sporadic timeout of 'gluster v status' command on EC volume during in-service upgrade
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Upasana <ubansal>
Component: glusterd
Assignee: Sanju <srakonde>
Status: CLOSED WORKSFORME
QA Contact: Bala Konda Reddy M <bmekala>
Severity: medium
Docs Contact:
Priority: medium
Version: rhgs-3.4
CC: amukherj, nchilaka, rhs-bugs, sankarshan, sheggodu, srakonde, storage-qa-internal, ubansal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-12-04 16:39:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1558948

Description Upasana 2018-07-06 11:27:18 UTC
Description of problem:
======================
'gluster v status' times out while an in-service upgrade is running


Version-Release number of selected component (if applicable):
============================================================
glusterfs-server-3.12.2-13.el7rhgs.x86_64
glusterfs-server-3.8.4-54.14.el7rhgs.x86_64


How reproducible:
================
Inconsistent 

Steps to Reproduce:
==================

An in-service upgrade of the cluster was in progress.
1. Upgraded 2 nodes from 3.3.1 to 3.4.0
2. Healing was in progress on the EC volume
3. Checked 'gluster v status' from a node that was yet to be upgraded (see the polling sketch below the output)
[root@dhcp35-18 ~]# time gluster v status dispersed
Error : Request timed out

real	2m0.992s
user	0m0.101s
sys	0m0.091s

Some time later, 'gluster v status' worked fine without any sluggishness.
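
Note that the ~2m elapsed time matches the gluster CLI's default 120-second request timeout. A minimal sketch of how the check was repeated during the upgrade window, assuming the volume name 'dispersed' and a not-yet-upgraded node; the loop interval and iteration count are illustrative, not from the original report:

#!/bin/bash
# Hypothetical polling loop run on a not-yet-upgraded node while the other
# nodes are being upgraded; volume name 'dispersed' taken from this report.
VOL=dispersed

for i in $(seq 1 30); do
    echo "=== iteration $i: $(date) ==="

    # Per-brick pending-heal counts; non-zero values mean healing is still
    # in progress (step 2 above).
    gluster volume heal "$VOL" info | grep 'Number of entries'

    # Time the status call; during the failure window this returned
    # "Error : Request timed out" after ~2 minutes.
    time gluster volume status "$VOL"

    sleep 60
done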
Actual results:
===============

'gluster v status' timed out with "Error : Request timed out" while healing was in progress during the in-service upgrade.


Expected results:
=================

'gluster v status' should not time out.


Additional info:
Logs and sosreports have been uploaded to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/ubansal/vstatus/

Comment 2 Upasana 2018-07-06 11:37:02 UTC
Setup Info -
Created an EC volume with the settings below and mounted it on 2 clients:
[root@dhcp35-122 yum.repos.d]# gluster v info
 
Volume Name: dispersed
Type: Distributed-Disperse
Volume ID: fb968754-610e-408b-8217-840038992694
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.18:/gluster/brick1/dist-dispersed
Brick2: 10.70.35.57:/gluster/brick1/dist-dispersed
Brick3: 10.70.35.131:/gluster/brick1/dist-dispersed
Brick4: 10.70.35.66:/gluster/brick1/dist-dispersed
Brick5: 10.70.35.94:/gluster/brick1/dist-dispersed
Brick6: 10.70.35.122:/gluster/brick1/dist-dispersed
Brick7: 10.70.35.18:/gluster/brick2/dist-dispersed
Brick8: 10.70.35.57:/gluster/brick2/dist-dispersed
Brick9: 10.70.35.131:/gluster/brick2/dist-dispersed
Brick10: 10.70.35.66:/gluster/brick2/dist-dispersed
Brick11: 10.70.35.94:/gluster/brick2/dist-dispersed
Brick12: 10.70.35.122:/gluster/brick2/dist-dispersed
Options Reconfigured:
disperse.shd-max-threads: 64
disperse.optimistic-change-log: off
diagnostics.client-log-level: DEBUG
disperse.eager-lock: off
transport.address-family: inet
nfs.disable: on
[root@dhcp35-122 yum.repos.d]# 
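
For reference, a sketch of the commands that would produce the 2 x (4 + 2) distributed-disperse volume and options shown above. The exact create/mount commands were not captured in this report, so this is a reconstruction: brick paths and hostnames are taken from the 'gluster v info' output, and the mount point is illustrative.

# Reconstruction of the volume setup (not copied from the original report).
gluster volume create dispersed disperse-data 4 redundancy 2 \
    10.70.35.18:/gluster/brick1/dist-dispersed \
    10.70.35.57:/gluster/brick1/dist-dispersed \
    10.70.35.131:/gluster/brick1/dist-dispersed \
    10.70.35.66:/gluster/brick1/dist-dispersed \
    10.70.35.94:/gluster/brick1/dist-dispersed \
    10.70.35.122:/gluster/brick1/dist-dispersed \
    10.70.35.18:/gluster/brick2/dist-dispersed \
    10.70.35.57:/gluster/brick2/dist-dispersed \
    10.70.35.131:/gluster/brick2/dist-dispersed \
    10.70.35.66:/gluster/brick2/dist-dispersed \
    10.70.35.94:/gluster/brick2/dist-dispersed \
    10.70.35.122:/gluster/brick2/dist-dispersed

# Options reconfigured on the volume, as listed in 'gluster v info' above.
gluster volume set dispersed disperse.shd-max-threads 64
gluster volume set dispersed disperse.optimistic-change-log off
gluster volume set dispersed diagnostics.client-log-level DEBUG
gluster volume set dispersed disperse.eager-lock off

gluster volume start dispersed

# Mount on each of the two clients (mount point is illustrative).
mount -t glusterfs 10.70.35.18:/dispersed /mnt/dispersed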

From one client an untar of a Linux kernel tarball was running, and from the other dd I/O was running (see the sketch below).
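
A minimal sketch of that client workload, assuming the volume is mounted at /mnt/dispersed on both clients; the tarball name, file size, and block size are illustrative, not from the report:

# Client 1: untar a Linux kernel source tarball onto the gluster mount
# (tarball name is illustrative).
mkdir -p /mnt/dispersed/untar-test
tar -xf linux-4.18.tar.xz -C /mnt/dispersed/untar-test

# Client 2: sequential dd writes onto the same volume
# (file size and block size are illustrative).
mkdir -p /mnt/dispersed/dd-test
dd if=/dev/zero of=/mnt/dispersed/dd-test/file1 bs=1M count=10240 conv=fdatasync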

Comment 23 Atin Mukherjee 2018-10-12 06:26:16 UTC
This doesn't look reproducible, so what's the plan here?

Comment 30 Atin Mukherjee 2018-12-02 07:50:16 UTC
Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?

Comment 31 Upasana 2018-12-03 05:21:42 UTC
(In reply to Atin Mukherjee from comment #30)
> Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?

Hi Atin, 

I haven't hit it in the 3.4 BU1 to 3.4 BU2 upgrade path.