Bug 1654409

Summary: For a particular block device, the initiator could log in to only 2 target portals instead of 3
Product: Red Hat Gluster Storage Reporter: Neha Berry <nberry>
Component: gluster-block Assignee: Prasanna Kumar Kalever <prasanna.kalever>
Status: CLOSED DUPLICATE QA Contact: Neha Berry <nberry>
Severity: high Docs Contact:
Priority: unspecified    
Version: ocs-3.11 CC: bgoyal, kramdoss, pkarampu, pprakash, prasanna.kalever, rhs-bugs, sankarshan, vbellur, vinug, xiubli
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-02-07 08:31:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1624670, 1624678    
Bug Blocks:    

Description Neha Berry 2018-11-28 17:32:03 UTC
Description of problem:
++++++++++++++++++++++++
We were validating the fix https://bugzilla.redhat.com/show_bug.cgi?id=1596021 when we hit this issue.

Created 20 app pods with existing block pvcs with HA=3.

For one app pod, the initiator logged in to only 2 target portals (checked with 'iscsiadm -m session').
Hence, both mpath and iscsiadm listed only 2 paths instead of the expected 3.

For the other 19 pods, iscsiadm -m session showed 3 paths each.

Hence we cannot say that one particular target portal was always down or had a permanent issue. It seems that login to one target portal failed only for this block device (block6).

Note: The original issue of BZ#1596021 (a single path was not mounted to the app pod) was not hit in this case. In BZ#1596021, there were 3 iscsiadm logins, but mpath listed fewer than 3 paths. In this case, the iscsiadm login itself failed for one target portal IP.
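
The difference between the two failure modes can be made explicit with a small check. This is only a sketch: classify_paths is a hypothetical helper (not part of open-iscsi, multipath-tools or gluster-block), and it assumes the caller has already counted the iSCSI sessions for the target and the paths that multipath reports for the device.

```shell
# Hypothetical helper, not part of any shipped tool.
# Usage: classify_paths EXPECTED_HA SESSION_COUNT MPATH_PATH_COUNT
classify_paths() {
    expected="$1"; sessions="$2"; paths="$3"
    if [ "$sessions" -lt "$expected" ]; then
        # Login itself failed for at least one target portal (this bug).
        echo "login-failure"
    elif [ "$paths" -lt "$expected" ]; then
        # All logins succeeded, but multipath shows fewer paths
        # (the pattern described for BZ#1596021).
        echo "mpath-missing-path"
    else
        echo "ok"
    fi
}
```

For block6 in this report (HA=3, 2 sessions, 2 mpath paths), `classify_paths 3 2 2` prints `login-failure`.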

Some details. 
======================

[root@dhcp41-225 ~]# iscsiadm -m session|wc -l
59   <----- 1 less than the expected total of 20*3 = 60

[root@dhcp41-225 ~]# df -kh|grep mpath|wc -l
20


block volume name = blk_app-prj_block6_c07841a6-ef35-11e8-ab37-0a580a810203

iqn in question = iqn.2016-12.org.gluster-block:ba6c19bd-da6a-4184-80f0-ae4431fc4adc

pvc in question = app-prj/block6

pv in question = pvc-c06df5c5-ef35-11e8-a087-005056a529f3

app pod in question = cirrosblock6-1-796cd



Version-Release number of selected component (if applicable):
=================================

# oc version
oc v3.11.43

# rpm -qa|grep ansible
openshift-ansible-playbooks-3.11.43-1.git.0.fa69a02.el7.noarch
ansible-2.6.7-1.el7ae.noarch
openshift-ansible-roles-3.11.43-1.git.0.fa69a02.el7.noarch
openshift-ansible-docs-3.11.43-1.git.0.fa69a02.el7.noarch
openshift-ansible-3.11.43-1.git.0.fa69a02.el7.noarch



# oc rsh glusterfs-storage-58btb rpm -qa|grep gluster
glusterfs-fuse-3.12.2-27.el7rhgs.x86_64
python2-gluster-3.12.2-27.el7rhgs.x86_64
glusterfs-server-3.12.2-27.el7rhgs.x86_64
gluster-block-0.2.1-29.el7rhgs.x86_64
glusterfs-api-3.12.2-27.el7rhgs.x86_64
glusterfs-cli-3.12.2-27.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-27.el7rhgs.x86_64
glusterfs-libs-3.12.2-27.el7rhgs.x86_64
glusterfs-3.12.2-27.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-27.el7rhgs.x86_64


OCS 3.11.1


How reproducible:
====================
Intermittent: out of 20 app pods, the issue was seen for only 1.

Steps to Reproduce:
1. Create 100 block pvcs
2. Attach 20 of the block pvcs to app pods using the following loop:
 for i in {1..20}; do ./cirros-create.sh  block$i block$i ; sleep 1; done

3. Check the iscsiadm -m session on the initiator nodes and confirm the counts.
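
Step 3 can be scripted instead of checked by eye. The following is a minimal sketch, assuming 'iscsiadm -m session' prints one line per session in the usual "tcp: [SID] IP:PORT,TPGT IQN (non-flash)" form, so that the IQN is the fourth whitespace-separated field. The function name find_short_targets and its default of 3 expected sessions are illustrative assumptions, not part of open-iscsi or gluster-block.

```shell
# Hypothetical helper: read `iscsiadm -m session` output on stdin and print
# every target IQN that has fewer sessions than expected.
# Usage: iscsiadm -m session | find_short_targets [EXPECTED_HA]
find_short_targets() {
    expected="${1:-3}"   # HA count; 3 in this report
    awk -v want="$expected" '
        # Count one session per line whose 4th field looks like an IQN.
        $4 ~ /^iqn\./ { count[$4]++ }
        END {
            for (iqn in count)
                if (count[iqn] < want)
                    printf "%s %d/%d\n", iqn, count[iqn], want
        }
    ' | sort
}
```

An empty result means every logged-in target has all of its expected sessions. A target with no sessions at all does not appear in the iscsiadm output, so this check cannot report it.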

Actual results:
===============
With HA=3, iscsiadm -m session does not list all 3 target portals for one app pod backed by a block device.

Expected results:
====================
All 3 target portals should be logged in successfully from the initiator side.

Comment 7 Prasanna Kumar Kalever 2019-02-07 08:31:54 UTC

*** This bug has been marked as a duplicate of bug 1597320 ***