Bug 1624698

Summary: [Tracking BZ#1632719] With only 1 node down, multipath -ll shows multiple paths in "failed" state
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Neha Berry <nberry>
Component: gluster-block    Assignee: Prasanna Kumar Kalever <prasanna.kalever>
Status: CLOSED ERRATA QA Contact: Neha Berry <nberry>
Severity: high Docs Contact:
Priority: medium    
Version: cns-3.10    CC: akrishna, amukherj, atumball, bgoyal, hchiramm, jahernan, kramdoss, madam, nberry, pkarampu, pprakash, prasanna.kalever, rcyriac, rhs-bugs, rtalur, sankarshan, vbellur, xiubli
Target Milestone: ---    Keywords: ZStream
Target Release: OCS 3.11.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.12.2-20 Doc Type: Bug Fix
Doc Text:
Previously, if any brick of the block-hosting volume (BHV) went down after the block volume was mounted, all paths to the block volume could enter the failed state. With a single brick down, the backend glusterfs BHV took far too long (~14 minutes) to respond to I/O requests, whereas the expected response time is 42 seconds. As a result, all applications using the block volume encountered input/output errors. With this fix, the glusterfs block-hosting volume's server.tcp-user-timeout is set to 42 seconds by default (a usage sketch follows the metadata block below).
Story Points: ---
Clone Of:
Clones: 1632719 (view as bug list)    Environment:
Flags: devel_ack? → devel_ack+
Last Closed: 2019-02-07 03:38:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1623874    
Bug Blocks: 1641915, 1644154    
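
The fix described in the doc text amounts to tuning one glusterfs volume option on the block-hosting volume. A minimal sketch of checking and applying it by hand, assuming the BHV named later in this report (on fixed builds, glusterfs-3.12.2-20 onwards, gluster-block sets this by default):

# gluster volume get vol_f51ea2467acb3ca749e32a9f1993da12 server.tcp-user-timeout
# gluster volume set vol_f51ea2467acb3ca749e32a9f1993da12 server.tcp-user-timeout 42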

Description Neha Berry 2018-09-03 06:21:06 UTC
With only 1 node down, multipath -ll shows multiple paths in "failed" state
Description
================

We were regression-testing the fix for BZ#1623433 when this behavior was observed.

In a CNS 3.10 setup, we had a block PVC (block4-1) mounted on an app pod bkblock4-1-1-q7wxq on initiator node 10.70.46.145. The block device was created with HA=4.

Block device name = bk4_glusterfs_block4-1_107d2230-aeb1-11e8-80c4-0a580a830204
Block device iqn = iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849
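
For completeness, the block device's HA count and portal list can be cross-checked with the gluster-block CLI from a gluster pod. A sketch, assuming the BHV named later in this report:

# gluster-block info vol_f51ea2467acb3ca749e32a9f1993da12/bk4_glusterfs_block4-1_107d2230-aeb1-11e8-80c4-0a580a830204

The EXPORTED ON portal list in its output should match the four initiator paths shown below.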


Steps in short
======================

Step 1: Brought down a passive-path node, 10.70.46.169. Only that single path failed, and it was restored successfully on powering the node back ON. No other paths went into the failed state.

Step 2: POWERED OFF the active-path node, 10.70.47.149, and observed that instead of a single path going into the failed state, the following happened:

i) Path sdf (the active path for this PVC, 10.70.47.149) and path sdh (10.70.46.53, whose node was UP) went into the "Failed" state
ii) Path sdg became active
iii) Then suddenly sdg, sdf and sdi failed, and sdh was restored on its own. IO continued from sdh

Thus, it looked like the other 3 paths were also failing and being reinstated, even though only sdf was supposed to be down.
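
To correlate each sdX device with its target portal when reading the states above, the session details can be dumped on the initiator (a sketch using standard iscsiadm output; the by-path listing below gives the same mapping):

# iscsiadm -m session -P 3 | grep -E 'Current Portal|Attached scsi disk'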


Path status before any node poweroff/on
================================================

mpatha (36001405e9e0d58fe433463dba63113d4) dm-18 LIO-ORG ,TCMU device     
size=3.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 33:0:0:0 sdf 8:80  active ready running
|-+- policy='round-robin 0' prio=10 status=enabled
| `- 34:0:0:0 sdg 8:96  active ready running
|-+- policy='round-robin 0' prio=10 status=enabled
| `- 36:0:0:0 sdi 8:128 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 35:0:0:0 sdh 8:112 active ready running

[root@dhcp46-145 ~]# ll /dev/disk/by-path/ip*
lrwxrwxrwx. 1 root root 9 Sep  2 18:43 /dev/disk/by-path/ip-10.70.46.169:3260-iscsi-iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849-lun-0 -> ../../sdg
lrwxrwxrwx. 1 root root 9 Sep  2 18:43 /dev/disk/by-path/ip-10.70.46.53:3260-iscsi-iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849-lun-0 -> ../../sdh
lrwxrwxrwx. 1 root root 9 Sep  2 18:43 /dev/disk/by-path/ip-10.70.47.149:3260-iscsi-iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Sep  2 18:43 /dev/disk/by-path/ip-10.70.47.79:3260-iscsi-iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849-lun-0 -> ../../sdi

# iscsiadm -m session
tcp: [1] 10.70.47.149:3260,1 iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849 (non-flash)
tcp: [2] 10.70.46.169:3260,2 iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849 (non-flash)
tcp: [3] 10.70.46.53:3260,3 iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849 (non-flash)
tcp: [4] 10.70.47.79:3260,4 iqn.2016-12.org.gluster-block:e9e0d58f-e433-463d-ba63-113d40076849 (non-flash)




Path changes seen in a matter of minutes once 10.70.47.149 was in the POWERED OFF state
=========================================

i)

mpatha (36001405e9e0d58fe433463dba63113d4) dm-18 LIO-ORG ,TCMU device     
size=3.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 33:0:0:0 sdf 8:80  failed faulty running     <-----------------------Failed - expected
|-+- policy='round-robin 0' prio=10 status=active
| `- 34:0:0:0 sdg 8:96  active ready running
|-+- policy='round-robin 0' prio=10 status=enabled
| `- 36:0:0:0 sdi 8:128 active ready running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 35:0:0:0 sdh 8:112 failed faulty running <-----------------------Failed -unexpected


ii) 

[root@dhcp46-145 ~]# multipath -ll
mpatha (36001405e9e0d58fe433463dba63113d4) dm-18 LIO-ORG ,TCMU device     
size=3.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 33:0:0:0 sdf 8:80  failed faulty running        <-----------------------Failed state - expected
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 34:0:0:0 sdg 8:96  failed faulty running         <-----------------------Failed state- unexpected
|-+- policy='round-robin 0' prio=10 status=active
| `- 36:0:0:0 sdi 8:128 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 35:0:0:0 sdh 8:112 active ready running          <------------------------path state again changed from (i)-Failed to active

iii)

[root@dhcp46-145 ~]# multipath -ll
mpatha (36001405e9e0d58fe433463dba63113d4) dm-18 LIO-ORG ,TCMU device     
size=3.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 33:0:0:0 sdf 8:80  failed faulty running              <-----------------------Failed state - expected
|-+- policy='round-robin 0' prio=10 status=active
| `- 34:0:0:0 sdg 8:96  active ready running
|-+- policy='round-robin 0' prio=10 status=enabled
| `- 36:0:0:0 sdi 8:128 active ready running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 35:0:0:0 sdh 8:112 failed faulty running         <------------------------path state again changed from (ii)-active to Failed


iv)

[root@dhcp46-145 pvc-14befacf-aeb4-11e8-8e57-005056a59bf3]# multipath -ll
mpatha (36001405e9e0d58fe433463dba63113d4) dm-18 LIO-ORG ,TCMU device     
size=3.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 33:0:0:0 sdf 8:80  failed faulty running         <-----------------------Failed state - expected
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 34:0:0:0 sdg 8:96  failed faulty running          <-----------------------Failed state - unexpected
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 36:0:0:0 sdi 8:128 failed faulty running          <-----------------------Failed state - unexpected
`-+- policy='round-robin 0' prio=10 status=active
  `- 35:0:0:0 sdh 8:112 active ready running
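
To catch these flapping transitions as they happen, rather than sampling multipath -ll by hand, the path events can be followed live on the initiator. A sketch, assuming a systemd host where multipathd logs fail/reinstate events to the journal:

# watch -n 5 'multipath -ll mpatha'
# journalctl -u multipathd -f | grep -iE 'fail|reinstat'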




Steps Performed:
=========================

1. Created a 4-node CNS 3.10 setup with the RC build

2. Created 2 PVCs with HA=4 (bk4_glusterfs_block4-1_107d2230-aeb1-11e8-80c4-0a580a830204, bk4_glusterfs_block4-2_308c2e27-aeb1-11e8-80c4-0a580a830204)

3. The setup has 1 file volume (heketidbstorage) and 1 BHV (vol_f51ea2467acb3ca749e32a9f1993da12)

4. Powered off the first gluster node, 10.70.46.169, for some time. Approx time = Sun Sep  2 13:31:34 UTC 2018

5. Checked the path status; the path for 10.70.46.169 went into the FAILED state as expected.

6. Powered on node 10.70.46.169. Approx time = Sun Sep  2 13:35:33 UTC 2018. The path was restored.

7. Confirmed that all the paths were in the "RUNNING" state

8. POWERED OFF another gluster node, 10.70.47.149. Approx time = Sun Sep  2 15:32:56 UTC 2018. This was the active path (sdf) for the block device.

9. Checked the multipath status multiple times (a polling sketch follows this list). Each time, more than 1 path was seen in the failed state.

10. POWERED ON 10.70.47.149. Approx time = Sun Sep  2 15:48:13 UTC 2018

11. Once the node/glusterfs pod was UP, all 4 paths were back in the RUNNING state. Hence, it is not clear why multiple paths were going on/off when only 1 path was down.
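
The polling sketch referenced in step 9: a simple loop that timestamps each multipath snapshot during the power-off window (mpatha is the map name from this report):

# while true; do date; multipath -ll mpatha | grep -E 'status=|sd[a-z]'; sleep 10; done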


How reproducible:
++++++++++++++++++++++++
It is NOT a 100% reproducible issue; it is seen intermittently.

Actual results:
++++++++++++++++++++++++
More than 1 path kept flipping between the Failed and Running states.

Expected results:
++++++++++++++++++++++++
When only 1 path is failed by the user, all other paths should stay in the Running state, and IO should continue from the first passive path that the device fails over to.

Comment 37 Amar Tumballi 2018-09-19 09:48:49 UTC
> Prasanna, let's give Devel ack for this bug, considering we will have the RHGS fix available with the OCS 3.11 release. If it doesn't make it, or the release timeline does not match, we will take back the acks.

Humble, at this point RHGS would include this fix, but the release date is the concern. Can't we handle it at a higher level until we have the RHGS version? That way, we will have a smooth dependency chain in the release.

Comment 64 Anjana KD 2019-01-07 08:25:11 UTC
I have updated the doc text. Kindly verify it for accuracy.

Comment 67 errata-xmlrpc 2019-02-07 03:38:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0285