Bug 2112703

Summary: [RDR] 2 busybox pods are stuck in ContainerCreating with message "Unable to attach or mount volumes" and never recover when sequential failover is performed
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Aman Agrawal <amagrawa>
Component: odf-dr
Sub component: ramen
Assignee: Shyamsundar <srangana>
QA Contact: krishnaram Karthick <kramdoss>
Status: CLOSED DUPLICATE
Severity: urgent
Priority: unspecified
CC: idryomov, jespy, kseeger, madam, mrajanna, muagarwa, ocs-bugs, odf-bz-bot, prsurve, sheggodu, srangana, ypadia
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Cloned to: 2123501
Last Closed: 2022-11-08 17:58:40 UTC
Type: Bug
Bug Depends On: 2023189

Comment 7 Madhu Rajanna 2022-08-01 05:58:15 UTC
> I think so.
> And do you think it's under the right component? If not, please change it.

It should probably go to csi/ceph, though I am not sure.

>Also, do we have a workaround? My setup is still in the same shape, and it is blocking the testing.

I don't have a workaround for this one at the moment.


>  Warning  FailedMount             9m34s (x38 over 77m)  kubelet                  MountVolume.SetUp failed for volume "pvc-ddbcd2b8-a304-4c38-b559-0c5d6b665fed" : applyFSGroup failed for vol 0001-0011-openshift-storage-0000000000000001-5e87657f-0cf4-11ed-8345-0a580a870036: lstat /var/lib/kubelet/pods/4672d7f6-23f8-481f-82e1-0119ccfe6cae/volumes/kubernetes.io~csi/pvc-ddbcd2b8-a304-4c38-b559-0c5d6b665fed/mount/data_1659175021: bad message


This could be due to filesystem corruption. @Ilya, any idea about this one?
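For context, "bad message" is the error string for EBADMSG, which lstat(2) can return when the filesystem detects a corrupt directory entry; kubelet hits it while walking the volume to apply the fsGroup, which is why MountVolume.SetUp fails. Below is a minimal sketch (not kubelet's actual code) that lstats every entry under a path, roughly the way the fsGroup walk does, and flags EBADMSG; the path argument is a placeholder for the volume mount path from the event above.

package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"syscall"
)

func main() {
	// Placeholder: pass the volume path from the kubelet event, e.g.
	// /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pvc>/mount
	root := os.Args[1]
	walkErr := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Lstat each entry, similar to kubelet's recursive fsGroup walk.
		if _, err := os.Lstat(p); err != nil {
			var errno syscall.Errno
			if errors.As(err, &errno) && errno == syscall.EBADMSG {
				fmt.Printf("corrupt entry: %s (lstat: bad message)\n", p)
				return nil // keep walking to list every corrupt entry
			}
			return err
		}
		return nil
	})
	if walkErr != nil {
		fmt.Fprintln(os.Stderr, walkErr)
		os.Exit(1)
	}
}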

> I am not sure, but the other bug was specifically for bitnami images only, and here I see some other errors too:

The above is an intermittent error message returned from the csi driver when the image is not a healthy primary (the message was removed in recent builds). It is not related to this issue; the error goes away once the image is promoted to healthy primary.
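One way to confirm whether the image has been promoted is to query its mirror status. A hedged sketch that shells out to the rbd CLI: the pool name (the ODF default ocs-storagecluster-cephblockpool) is an assumption for this cluster, the image name is derived from the volume handle in the event above assuming the usual csi-vol-<uuid> naming, and the JSON field names may vary across Ceph releases.

package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// Subset of `rbd mirror image status --format json` output; field names
// match recent Ceph releases but may differ by version.
type mirrorStatus struct {
	State       string `json:"state"`       // e.g. "up+stopped" on a primary
	Description string `json:"description"` // e.g. "local image is primary"
}

func main() {
	// Placeholder pool/image, see the assumptions in the lead-in above.
	out, err := exec.Command("rbd", "mirror", "image", "status",
		"ocs-storagecluster-cephblockpool/csi-vol-5e87657f-0cf4-11ed-8345-0a580a870036",
		"--format", "json").Output()
	if err != nil {
		panic(err)
	}
	var st mirrorStatus
	if err := json.Unmarshal(out, &st); err != nil {
		panic(err)
	}
	fmt.Printf("state=%s description=%s\n", st.State, st.Description)
}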

Comment 19 Madhu Rajanna 2022-08-30 03:48:01 UTC
You are hitting https://bugzilla.redhat.com/show_bug.cgi?id=2023189