Bug 1672492 - [Tracking] [OCS 3.11.1]: A few block PVCs fail to mount with the error 'failed to get any path for iscsi disk'
Summary: [Tracking] [OCS 3.11.1]: A few block PVCs fail to mount with the error 'failed to get any path for iscsi disk'
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-block
Version: ocs-3.11
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: OCS 3.11.z Batch Update 4
Assignee: Xiubo Li
QA Contact: Ashmitha Ambastha
Depends On: 1701858
Blocks: 1707226
Reported: 2019-02-05 06:55 UTC by Ashmitha Ambastha
Modified: 2019-10-07 15:16 UTC
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1701858
Last Closed: 2019-07-22 10:39:42 UTC
Target Upstream Version:


Description Ashmitha Ambastha 2019-02-05 06:55:52 UTC
Description of problem:

On a 4-node OCS 3.11.1 cluster, 50 Cirros app pods were created. Of those 50, only 20 reached the Running (1/1) state. The rest hit FailedMount events because the kubelet failed to get any path for the iSCSI disk.

Version-Release number of selected component (if applicable): OCS 3.11.1

How reproducible: On 25-30 of the block-backed Cirros pods

Steps to Reproduce:
1. Set up a cluster with OCP 3.11 (live) and OCS 3.11.1 builds.
2. Create 50 Cirros pods and PVCs. All 50 PVCs reached the Bound state, but only 20 pods were Running, even after deleting the pods and restarting pod creation three times.
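The reproduction in step 2 can be sketched as a loop that generates the 50 PVC and pod manifests. This is a hypothetical sketch, not the reporter's actual tooling: the StorageClass name `glusterfs-block` and the `cirros-pvc-N`/`cirros-pod-N` names are placeholder assumptions.

```shell
#!/bin/sh
# Sketch: generate 50 block-backed PVC + Cirros pod manifests.
# "glusterfs-block" is a placeholder StorageClass name; substitute the
# actual gluster-block StorageClass configured on the cluster.
MANIFEST=/tmp/cirros-block-50.yaml
: > "$MANIFEST"
i=1
while [ "$i" -le 50 ]; do
cat >> "$MANIFEST" <<EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cirros-pvc-$i
spec:
  storageClassName: glusterfs-block
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: cirros-pod-$i
spec:
  containers:
  - name: cirros
    image: cirros
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /mnt
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: cirros-pvc-$i
EOF
i=$((i + 1))
done
echo "generated $(grep -c '^kind: Pod$' "$MANIFEST") pod manifests"
# Apply with: oc create -f /tmp/cirros-block-50.yaml
```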

Actual results: The mount failed and the Cirros pods did not come up.

Expected results: The mount should succeed and the Cirros pods should come up.
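As a minimal triage sketch for this failure mode, the following checks could be run on the node where a stuck pod was scheduled. The pod name is a hypothetical placeholder; the commands assume `oc` and `iscsiadm` are available on that node.

```shell
#!/bin/sh
# Sketch: node-side triage when a pod reports
# "failed to get any path for iscsi disk".
POD="cirros-pod-1"   # hypothetical pod name

# Confirm the FailedMount event on the pod (requires an oc login).
command -v oc >/dev/null 2>&1 && oc describe pod "$POD" | grep -i -A2 FailedMount

# List active iSCSI sessions on the node; no session for the target
# is consistent with the error above.
command -v iscsiadm >/dev/null 2>&1 && iscsiadm -m session

# Count the iSCSI block devices that actually appeared on the node.
ISCSI_PATHS=$(ls /dev/disk/by-path/ 2>/dev/null | grep -c iscsi)
echo "iscsi device paths visible: ${ISCSI_PATHS:-0}"
```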

Comment 3 Ashmitha Ambastha 2019-02-05 07:13:12 UTC
The pod has not come up at all, even after deleting and recreating it.

Comment 4 Prasanna Kumar Kalever 2019-02-05 09:54:00 UTC
Can you please attach the sosreports and other relevant details, such as the block volume name, `targetcli ls` output, etc.?
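The requested gluster-node details could be gathered with a small helper script along these lines. This is a sketch only: the volume name `blockvol` is a placeholder, and the output directory is arbitrary.

```shell
#!/bin/sh
# Sketch: write a log-collection helper to run on each gluster node,
# covering the items requested above. "blockvol" is a placeholder
# block-hosting volume name.
cat > /tmp/collect-gluster-block-info.sh <<'EOF'
#!/bin/sh
OUT=/tmp/gluster-block-info
mkdir -p "$OUT"
targetcli ls                                   > "$OUT/targetcli-ls.txt"   2>&1
gluster-block list blockvol                    > "$OUT/block-list.txt"     2>&1
journalctl -u gluster-blockd -n 500 --no-pager > "$OUT/gluster-blockd.log" 2>&1
EOF
chmod +x /tmp/collect-gluster-block-info.sh
```

Running the generated script on each node and attaching the resulting directory would cover the `targetcli ls` output and block volume listing in one pass.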

Comment 12 Ashmitha Ambastha 2019-02-11 09:20:26 UTC

I've attached the gluster-block provisioner pod logs, a list of all relevant versions, and the `oc describe` output of the pod where the issue was seen. While taking sosreports of the gluster nodes, I was hitting this error:

 [plugin:lvm2] command 'pvs -a -v -o +pv_mda_free,pv_mda_size,pv_mda_count,pv_mda_used_count,pe_start --config="global{locking_type=0}"' timed out after 300s

and hence was not able to collect all the sosreports.

Xiubo is looking into the set up now.
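The sosreport lvm2 plugin hang above could be checked manually by running the same `pvs` invocation under an explicit timeout, so a hang surfaces as exit status 124 instead of blocking. A sketch, assuming the 300s limit from the plugin message:

```shell
#!/bin/sh
# Sketch: reproduce the sosreport lvm2 plugin's pvs call with an explicit
# timeout; a hang shows up as exit status 124 rather than blocking.
if command -v pvs >/dev/null 2>&1; then
    timeout 300 pvs -a -v \
        -o +pv_mda_free,pv_mda_size,pv_mda_count,pv_mda_used_count,pe_start \
        --config 'global{locking_type=0}'
    RC=$?
    [ "$RC" -eq 124 ] && echo "pvs timed out, matching the sosreport failure"
else
    RC=127  # pvs not installed on this host
fi
echo "pvs exit status: $RC"
```

If `pvs` hangs here too, that points at LVM device scanning on the node itself rather than at sosreport.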

Comment 15 Prasanna Kumar Kalever 2019-03-05 07:28:29 UTC
Ashmitha, do we have any update here?

Comment 16 Ashmitha Ambastha 2019-03-15 08:06:26 UTC
Prasanna, I have not been able to hit the issue with the package Xiubo added, and I have shared setups that do show the issue with Xiubo. The setup gets ruined, with nodes going NotReady while collecting logs, which is why I'm unable to attach logs to this bug. I have not yet tried this on a fresh setup as part of the 3.11.2 testing.

Comment 19 Prasanna Kumar Kalever 2019-03-28 05:43:55 UTC
Hello Ashmitha,

By any chance, were the BHVs (block-hosting volumes) used by the block volumes here created on a version lower than OCS 3.11.1?


Comment 22 RamaKasturi 2019-03-28 17:23:54 UTC
Hello Vignesh,

   IIRC, you were saying that once the custom packages with added debug logs were installed, we were no longer able to reproduce the issue. Is that true? If yes, can you please update the bug with how the custom package was applied on the setup?

