Bug 2023189

Summary: [DR] Rbd image mount failed on pod saying applyFSGroup failed for vol
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Pratik Surve <prsurve>
Component: odf-dr Assignee: Shyamsundar <srangana>
odf-dr sub component: unclassified QA Contact: Sidhant Agrawal <sagrawal>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: amagrawa, ebenahar, idryomov, jespy, kramdoss, kseeger, muagarwa, ndevos, odf-bz-bot, srangana, ypadia
Version: 4.9   
Target Milestone: ---   
Target Release: ODF 4.14.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.14.0-103 Doc Type: Known Issue
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-08 18:49:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2207918    
Bug Blocks: 2112703, 2244409    

Description Pratik Surve 2021-11-15 07:27:30 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
[DR] RBD image mount failed on the pod with the error: applyFSGroup failed for vol 0001-0011-openshift-storage-0000000000000001-c8fa42ef-4260-11ec-8beb-0a580a810228: lstat /var/lib/kubelet/pods/bd5b929b-f03f-4a5c-b6b2-195f14ea6268/volumes/kubernetes.io~csi/pvc-9ab6d446-517c-4e84-be84-b21085c70d84/mount/data/mydatabase: bad message



Version of all relevant components (if applicable):

ODF version:- 4.9.0-230.ci
OCP version:- 4.9.0-0.nightly-2021-11-08-084355

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy DR cluster
2. Run workloads 
3. Do a failover


Actual results:

Events:
  Type     Reason       Age                     From     Message
  ----     ------       ----                    ----     -------
  Warning  FailedMount  30m (x523 over 3d17h)   kubelet  Unable to attach or mount volumes: unmounted volumes=[mysql-persistent-storage], unattached volumes=[kube-api-access-dh5rl mysql-persistent-storage]: timed out waiting for the condition
  Warning  FailedMount  16m (x1863 over 3d18h)  kubelet  Unable to attach or mount volumes: unmounted volumes=[mysql-persistent-storage], unattached volumes=[mysql-persistent-storage kube-api-access-dh5rl]: timed out waiting for the condition
  Warning  FailedMount  97s (x2625 over 3d17h)  kubelet  MountVolume.SetUp failed for volume "pvc-9ab6d446-517c-4e84-be84-b21085c70d84" : applyFSGroup failed for vol 0001-0011-openshift-storage-0000000000000001-c8fa42ef-4260-11ec-8beb-0a580a810228: lstat /var/lib/kubelet/pods/bd5b929b-f03f-4a5c-b6b2-195f14ea6268/volumes/kubernetes.io~csi/pvc-9ab6d446-517c-4e84-be84-b21085c70d84/mount/data/mydatabase: bad message
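
The last event above fails on an lstat of a path inside the mounted volume, and "bad message" is the Linux error text for EBADMSG. A minimal sketch, assuming shell access to the node where the volume is mounted, of walking a mount point and listing every entry whose lstat fails, which may help narrow down which files are affected. `find_lstat_failures` is a hypothetical helper written for illustration; it is not part of kubelet, Ceph-CSI, or ODF.

```python
import errno
import os


def find_lstat_failures(root):
    """Return (path, errno_name) for each entry under `root` whose lstat fails.

    Hypothetical diagnostic helper: it only reads metadata and never
    changes ownership or permissions.
    """
    failures = []

    def record(path, exc):
        # Translate the numeric errno into its symbolic name, e.g. EBADMSG.
        failures.append((path, errno.errorcode.get(exc.errno, str(exc.errno))))

    def check(path):
        try:
            os.lstat(path)
            return True
        except OSError as exc:
            record(path, exc)
            return False

    # If the root itself cannot be stat'ed there is nothing to walk.
    if not check(root):
        return failures

    def on_walk_error(exc):
        # os.walk reports directories it cannot list through this callback.
        record(exc.filename, exc)

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            check(os.path.join(dirpath, name))
    return failures


if __name__ == "__main__":
    import sys

    for path, err in find_lstat_failures(sys.argv[1]):
        print(f"{err}\t{path}")
```

Run it on the node against the volume's mount directory (the `.../mount` path shown in the event). An entry reported as EBADMSG would correspond to the "bad message" error in the event.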


Expected results:


Additional info:

Mysql image used:- quay.io/bitnami/mysql@sha256:91e90e5d5a513eeddd9939ec186128d0d5e93e9ad1689311370da0a3b4da90d2

Comment 2 Humble Chirammal 2021-11-15 07:39:17 UTC
Pratik, can you share the pod spec of the MySQL app?

Also, are you facing this issue with any other pod specs?

Was this working in any earlier 4.9 build, or is this the first time we are observing this issue?

Comment 7 yati padia 2022-02-23 02:46:29 UTC
Moving this out of 4.10

Comment 8 Mudit Agarwal 2022-05-26 09:40:50 UTC
Yati, what is the plan for this BZ? Can we fix it in 4.11?

Comment 9 yati padia 2022-05-26 11:09:30 UTC
Hi Mudit, we need some DR testing here. I am not sure how much time it will take. I will confirm by next week.

Comment 12 krishnaram Karthick 2022-08-30 04:06:30 UTC
This is a blocker for RDR

Comment 13 krishnaram Karthick 2022-08-30 04:08:22 UTC
@Madhu - a couple of questions:
1) Is there a better way to reproduce this issue for verification? (We don't hit this consistently.)
2) Could there be a workaround for RDR TP?

Comment 19 Madhu Rajanna 2022-09-05 04:37:04 UTC
Adding needinfo on the reporter as per comment #16.

Comment 20 Mudit Agarwal 2022-09-29 01:47:39 UTC
Looks like we need to re-evaluate the blocker? flag here

Comment 31 Shyamsundar 2022-11-08 17:58:40 UTC
*** Bug 2112703 has been marked as a duplicate of this bug. ***

Comment 69 errata-xmlrpc 2023-11-08 18:49:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

Comment 70 Red Hat Bugzilla 2024-03-08 04:25:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.