Description of problem (please be as detailed as possible and provide log snippets):

When the odf-vault-auth serviceaccount is deleted and the OSD pod is respun, the pod goes into Init:CrashLoopBackOff state, as expected. The logs from the encryption-kms-get-kek container in the OSD pod show the following error message:

$ oc logs -c encryption-kms-get-kek rook-ceph-osd-1-bb55c5d5-645h2
2022-03-08 07:20:27.648973 C | rookcmd: failed to validate kms connection details: failed to get backend version: failed to initialize vault client: failed to get vault authentication token: Error making API request.
URL: PUT https://vault.default.svc.cluster.local:8200/v1/auth/kubernetes/login
Code: 403. Errors:
* permission denied

Since the authentication method is kubernetes, the error message should mention the failure to get the serviceaccount instead of the authentication token.

Version of all relevant components (if applicable):
---------------------------------------------------
OCP: 4.10.0-0.nightly-2022-03-08-002944
ODF: odf-operator.v4.10.0 full_version=4.10.0-179

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?
N/A

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:
No

Steps to Reproduce:
-------------------
1. Deploy an ODF cluster with clusterwide encryption enabled using the KMS kubernetes authentication method.
2. Once the cluster is up and running, delete the odf-vault-auth SA.
   $ oc delete sa odf-vault-auth
   serviceaccount "odf-vault-auth" deleted
3. Respin one of the OSD pods.
   $ oc delete pod rook-ceph-osd-1-bb55c5d5-5n2qf
   pod "rook-ceph-osd-1-bb55c5d5-5n2qf" deleted
4. The OSD pod should go into Init:CrashLoopBackOff state. Check the OSD logs.
   $ oc logs -c encryption-kms-get-kek rook-ceph-osd-1-bb55c5d5-645h2

Actual results:
---------------
The error message shown below refers to a missing authentication token.

2022-03-08 07:20:27.648973 C | rookcmd: failed to validate kms connection details: failed to get backend version: failed to initialize vault client: failed to get vault authentication token: Error making API request.
URL: PUT https://vault.default.svc.cluster.local:8200/v1/auth/kubernetes/login
Code: 403. Errors:
* permission denied

Expected results:
-----------------
The error message should mention the missing serviceaccount.
It's hard to make that distinction here. Also, the message "failed to get vault authentication token:" is not the last error; it is just one link in the chain of errors. I understand the confusion, but internally, when using Kubernetes auth with Vault, a token is still used for authentication, so the message is actually correct. I can probably change the error message if you think that would avoid the confusion, even though it is technically correct. What do you think?
Sounds good then.