Description of problem (please be as detailed as possible and provide log snippets):

The ocs-osd-removal job fails on OCS 4.6 with the following error:

ocs-osd-removal-0-t67lq   0/1   Init:CreateContainerConfigError   0   2s

Warning  Failed  8m50s (x12 over 10m)  kubelet, compute-0  Error: couldn't find key admin-secret in Secret openshift-storage/rook-ceph-mon

It looks like the admin-secret key in the rook-ceph-mon secret is not present in OCS 4.6:

$ oc get secret rook-ceph-mon -o yaml
apiVersion: v1
data:
  ceph-secret: QVFDYWIyQmZreGVnTkJBQWJXRTBJbmR5dVRDWWZLYmtTTXpjRFE9PQ==
  ceph-username: Y2xpZW50LmFkbWlu
  fsid: ZDQ3ZjNhZGYtODJjNy00YTViLTk4ZDEtZmU1YTA1NDk0MjZh
  mon-secret: QVFDYWIyQmZlT3VwTVJBQUdPUllPUHlhdDQyaWJ2cnRtdzZEMmc9PQ==
kind: Secret

rook-ceph-mon secret from OCS 4.5:
==================================
$ oc get secret rook-ceph-mon -o yaml
apiVersion: v1
data:
  admin-secret: QVFBd2IxOWZzbE1yTEJBQTFwZnVFSlZ1SVJkM3RBNTUrRTgzdXc9PQ==
  cluster-name: b3BlbnNoaWZ0LXN0b3JhZ2U=
  fsid: ZjI1YWNkMDQtMzhkYi00NDk2LTk1NGEtODlmNDY3MmUyNjIx
  mon-secret: QVFBd2IxOWYxOTBGS1JBQW1XQ3c4ZEV4bWxubzBxOTAvWlJzanc9PQ==
kind: Secret

Version of all relevant components (if applicable):
OCP: 4.6.0-0.nightly-2020-09-12-080441
ocs-operator.v4.6.0-553.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, disk replacement can't be performed.

Is there any workaround available to the best of your knowledge?
Not that I am aware of.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
No

If this is a regression, please provide more details to justify this:
Yes, because the ocs-osd-removal job was successful in OCS 4.5.

Steps to Reproduce:
1. Scale down the OSD to be replaced/removed
2. Run the OSD removal job
   $ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_ID=${osd_id_to_remove} | oc create -f -
3. Check the ocs-osd-removal pod status

Actual results:
The pod is in Init:CreateContainerConfigError.

Expected results:
The job should be successful and the pod status should be Completed.
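The key mismatch between the two secret dumps above can be checked offline with a small shell sketch. It is illustrative only: has_key and decode_value are hypothetical helper names, the key lists and the encoded values are copied by hand from the dumps in this report, and `base64 -d` assumes GNU coreutils. It is not part of the fix.

```shell
#!/bin/sh
# Sketch only: has_key and decode_value are hypothetical helpers, not part
# of oc, Rook, or the ocs-operator.

# Return 0 if the first argument appears among the remaining arguments.
has_key() {
    key=$1
    shift
    for k in "$@"; do
        [ "$k" = "$key" ] && return 0
    done
    return 1
}

# Decode a single base64-encoded Secret value (assumes GNU `base64 -d`).
decode_value() {
    printf '%s' "$1" | base64 -d
}

# Print whether admin-secret is among the given data keys.
report() {
    label=$1
    shift
    if has_key admin-secret "$@"; then
        echo "$label: admin-secret present"
    else
        echo "$label: admin-secret missing"
    fi
}

# Key names copied from the rook-ceph-mon dumps in this report.
report "OCS 4.5" admin-secret cluster-name fsid mon-secret
report "OCS 4.6" ceph-secret ceph-username fsid mon-secret

# Decode the ceph-username value from the 4.6 dump.
decode_value Y2xpZW50LmFkbWlu
echo
```

The job template still references the admin-secret key, while the 4.6 secret only carries ceph-secret and ceph-username. That matches the kubelet error quoted above.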
This will be resolved by the PR already in progress. https://github.com/openshift/ocs-operator/pull/648
We have two open BZs for what may be the same issue. If the issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=1886348 is different from the one originally reported in this BZ, then this BZ should be moved back to ON_QA and can be verified once https://bugzilla.redhat.com/show_bug.cgi?id=1886348 is fixed. If both issues are the same, then https://bugzilla.redhat.com/show_bug.cgi?id=1886348 should be closed as a duplicate of this one.
@Mudit I don't think the two issues are the same. For now we can keep both BZs: the current one and https://bugzilla.redhat.com/show_bug.cgi?id=1886348
Discussed offline with Servesha: this doesn't need any further code change. However, testing is blocked until we have a fix for https://bugzilla.redhat.com/show_bug.cgi?id=1886348. Hence, moving it to MODIFIED. Will move to ON_QA once https://bugzilla.redhat.com/show_bug.cgi?id=1886348 is fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5605