Description of problem:
=======================
Gluster v status times out while running in-service upgrade

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-server-3.12.2-13.el7rhgs.x86_64
glusterfs-server-3.8.4-54.14.el7rhgs.x86_64

How reproducible:
=================
Inconsistent

Steps to Reproduce:
===================
In-service upgrade was in progress:
1. Upgraded 2 nodes from 3.3.1 to 3.4.0
2. Healing was going on
3. Checked gluster v status from the node which was yet to be upgraded

[root@dhcp35-18 ~]# time gluster v status dispersed
Error : Request timed out

real    2m0.992s
user    0m0.101s
sys     0m0.091s

After some time, gluster v status worked fine without any sluggishness.

Expected results:
=================
gluster v status should not hang

Additional info:
================
Logs and sosreport uploaded at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/ubansal/vstatus/
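To quantify how long the CLI stays unresponsive before recovering, the timing above can be probed with a small polling loop. This is only a sketch: the `wait_for_cmd` helper and its 10s/10-attempt parameters are hypothetical; only the `gluster volume status dispersed` command and the ~2-minute CLI timeout come from the report.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds and report
# how long recovery took. Not part of the reported reproduction steps.
wait_for_cmd() {
    # Retry "$@" every 10s, up to 10 attempts; print elapsed seconds on success.
    start=$(date +%s)
    for attempt in 1 2 3 4 5 6 7 8 9 10; do
        if "$@" >/dev/null 2>&1; then
            echo "recovered after $(( $(date +%s) - start ))s"
            return 0
        fi
        sleep 10
    done
    echo "still failing after 10 attempts" >&2
    return 1
}

# Intended use on the not-yet-upgraded node (timeout value is an
# assumption, chosen slightly above the observed ~120s CLI timeout):
# wait_for_cmd timeout 130 gluster volume status dispersed
```

Running this right after the first timeout would show whether the hang clears on its own and roughly when, which may help correlate with the heal completing on the upgraded nodes.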
Setup Info:
===========
Had created an EC volume with the below settings and mounted it on 2 clients.

[root@dhcp35-122 yum.repos.d]# gluster v info

Volume Name: dispersed
Type: Distributed-Disperse
Volume ID: fb968754-610e-408b-8217-840038992694
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.18:/gluster/brick1/dist-dispersed
Brick2: 10.70.35.57:/gluster/brick1/dist-dispersed
Brick3: 10.70.35.131:/gluster/brick1/dist-dispersed
Brick4: 10.70.35.66:/gluster/brick1/dist-dispersed
Brick5: 10.70.35.94:/gluster/brick1/dist-dispersed
Brick6: 10.70.35.122:/gluster/brick1/dist-dispersed
Brick7: 10.70.35.18:/gluster/brick2/dist-dispersed
Brick8: 10.70.35.57:/gluster/brick2/dist-dispersed
Brick9: 10.70.35.131:/gluster/brick2/dist-dispersed
Brick10: 10.70.35.66:/gluster/brick2/dist-dispersed
Brick11: 10.70.35.94:/gluster/brick2/dist-dispersed
Brick12: 10.70.35.122:/gluster/brick2/dist-dispersed
Options Reconfigured:
disperse.shd-max-threads: 64
disperse.optimistic-change-log: off
diagnostics.client-log-level: DEBUG
disperse.eager-lock: off
transport.address-family: inet
nfs.disable: on
[root@dhcp35-122 yum.repos.d]#

From one client was running an untar of a Linux tarball, and from the other was running dd I/Os.
This doesn't look reproducible, so what's the plan here?
Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?
(In reply to Atin Mukherjee from comment #30)
> Upasana - Have you hit this in 3.4 BU1 to 3.4 BU2 upgrade path?

Hi Atin,
I haven't hit it in the 3.4 BU1 to 3.4 BU2 upgrade path.