Marking this BZ as urgent. Assuming replacing the OSDs is the only way forward, this could cause data loss for other customers.

Also, to add to the env details: this is a fresh cluster that was deployed last week:

$ cat storagecluster.yaml
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1
  kind: StorageCluster
  metadata:
    annotations:
      cluster.ocs.openshift.io/local-devices: "true"
      uninstall.ocs.openshift.io/cleanup-policy: delete
      uninstall.ocs.openshift.io/mode: graceful
    creationTimestamp: "2024-09-05T17:22:12Z"
I also forgot to mention: this happened on the ocs4 node earlier this week. We ended up replacing the OSDs to resolve the issue.
Would we need to configure something like [1] to see what is accessing the device and causing the wipe? Or is it too late to do anything at this point?

[1] https://access.redhat.com/solutions/7039896
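If it helps, here is a rough sketch of the kind of auditing I have in mind (the device name /dev/sdb and the audit key are just examples; the KCS article [1] has the supported steps):

  # Watch reads/writes/attribute changes on the OSD device node via auditd
  auditctl -w /dev/sdb -p rwa -k osd-wipe-watch

  # Later, check which processes touched the device
  ausearch -k osd-wipe-watch -i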
The customer did a full redeployment, installed 4.12.12, and still has the same issue. I don't think this is the same bug, or if it is, we didn't fix it in this version. The customer is uploading a fresh ODF must-gather now. I'm reviewing it as well and will post findings once I'm done.
Sorry, 4.12.14.
Getting the file for 'dd if=/dev/sdb of=/tmp/block.8k.dump bs=4K count=2'
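Once that dump is attached, a quick sanity check we could run (just a suggestion; it assumes the standard BlueStore label at the start of the device, which normally begins with the string "bluestore block device"):

  # Inspect the first 8 KiB that dd captured
  hexdump -C /tmp/block.8k.dump | head -20

  # A wiped label typically shows all zeros instead of the bluestore magic
  strings /tmp/block.8k.dump | grep -i bluestore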
Still coming up empty. The customer won't share their playbooks with us, nor have we been able to reproduce the issue. At the moment, the case is pending on them setting up another call with us.