Description of problem (please be as detailed as possible and provide log snippets):

In earlier versions (<4.10) there used to be a message in the logs (shown without timestamps):

level=info msg="✅ Exists: Secret \"ocs-kms-token\"\n"
level=info msg="✅ (Optional) Exists: Secret \"ocs-kms-ca-secret\"\n"
level=info msg="✅ (Optional) Exists: \"ocs-kms-client-cert\"\n"
level=info msg="✅ (Optional) Exists: \"ocs-kms-client-key\"\n"
level=info msg="✅ (Optional) Exists: Secret \"ocs-kms-ca-secret\"\n"
level=info msg="✅ (Optional) Exists: \"ocs-kms-client-cert\"\n"
level=info msg="✅ (Optional) Exists: \"ocs-kms-client-key\"\n"
level=info msg="✅ Exists: Secret \"ocs-kms-token\"\n"
level=info msg="ReconcileRootSecret: found root secret in external KMS successfully" sys=openshift-storage/noobaa

In ODF 4.10 this changes to an error message:

level=info msg="✅ (Optional) Exists: \"ocs-kms-ca-secret\"\n"
level=info msg="✅ (Optional) Exists: \"ocs-kms-client-cert\"\n"
level=info msg="✅ (Optional) Exists: \"ocs-kms-client-key\"\n"
level=info msg="✅ Exists: \"ocs-kms-token\"\n"
level=info msg="setKMSConditionType vault" sys=openshift-storage/noobaa
level=error msg="ReconcileRootSecret, KMS Get error No Secret Data found for Secret ID" sys=openshift-storage/noobaa
level=info msg="setKMSConditionStatus Init" sys=openshift-storage/noobaa

From the logs it looks like there is a change in KMS functionality that could break the encryption process.

Version of all relevant components (if applicable):
ODF 4.10.0-217

Steps to Reproduce:
1. Install ODF with cluster-wide encryption.
2. Check the noobaa-operator logs.

Actual results:
There is an error log: ReconcileRootSecret, KMS Get error No Secret Data found for Secret ID

Expected results:
There used to be a log message: ReconcileRootSecret: found root secret in external KMS successfully

Additional info:

ODF 4.10 noobaa-operator logs:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-010vukv11cs33-t1/j-010vukv11cs33-t1_20220404T095039/logs/failed_testcase_ocs_logs_1649069382/test_noobaa_kms_validation_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-3a6619c418c74824b6a69dfce26f2070aae8b7d56b7cea756ab68d5e2c459e07/namespaces/openshift-storage/pods/noobaa-operator-bcd7f9c57-v2x2g/noobaa-operator/noobaa-operator/logs/current.log

ODF 4.9 noobaa-operator logs:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-002vukv11cs33-t1/j-002vukv11cs33-t1_20220328T062106/logs/failed_testcase_ocs_logs_1648453414/test_failure_propagator_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-1b03fa95946c0265bfa57a63f2a5c141c16524a4db5bad56406c2026028cbb60/namespaces/openshift-storage/pods/noobaa-operator-55755594b9-p27l4/noobaa-operator/noobaa-operator/logs/current.log
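For step 2 of the reproducer, a minimal Python sketch like the one below could be used to pull the KMS-related lines out of the noobaa-operator logs on a live cluster. This is only for illustration; the label selector, container name, and use of the kubernetes Python client are assumptions on my side, not something taken from this report.

# Illustrative only: fetch noobaa-operator pod logs and print KMS-related lines.
# The label selector and container name below are assumptions about the deployment.
from kubernetes import client, config

def dump_kms_log_lines(namespace="openshift-storage"):
    config.load_kube_config()
    core = client.CoreV1Api()
    pods = core.list_namespaced_pod(namespace, label_selector="noobaa-operator=deployment")
    for pod in pods.items:
        log = core.read_namespaced_pod_log(pod.metadata.name, namespace, container="noobaa-operator")
        for line in log.splitlines():
            if "KMS" in line or "ReconcileRootSecret" in line:
                print(line)

if __name__ == "__main__":
    dump_kms_log_lines()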
Hello Filip,

The KMS module implementation was changed in the following PR: https://github.com/noobaa/noobaa-operator/pull/797. The log messages are no longer expected to match the old ones.

There are new mechanisms for troubleshooting: new condition statuses on the NooBaa CR. The internal state of the KMS subsystem is exposed via two new condition types:
"KMS-Status"
"KMS-Type"

Both condition statuses are set during each reconciliation cycle.

Regarding KMS-Type, per the log the type is vault, as expected:
> 2022-04-04T10:38:11.259392453Z time="2022-04-04T10:38:11Z" level=info msg="setKMSConditionType vault" sys=openshift-storage/noobaa

There are several possible values for "KMS-Status":
"Init" - external KMS initialized
"Sync" - the root key was synchronized from the external KMS
"Invalid" - invalid external KMS definition
"ErrorRead" - error reading the secret from the external KMS
"ErrorWrite" - error writing the initial root key to the external KMS

During installation, the expected status transition is "Init" -> "Sync". After the initial transition, the status should remain "Sync". And indeed, according to the logs:
> 2022-04-04T10:36:37.040505500Z time="2022-04-04T10:36:37Z" level=info msg="setKMSConditionStatus Init" sys=openshift-storage/noobaa
> 2022-04-04T10:37:37.634139287Z time="2022-04-04T10:37:37Z" level=info msg="setKMSConditionStatus Sync" sys=openshift-storage/noobaa
> 2022-04-04T10:38:11.081543011Z time="2022-04-04T10:38:11Z" level=info msg="setKMSConditionStatus Sync" sys=openshift-storage/noobaa
> 2022-04-04T10:38:11.278200788Z time="2022-04-04T10:38:11Z" level=info msg="setKMSConditionStatus Sync" sys=openshift-storage/noobaa
...

Let me describe the logic of the status transitions. "Init" happens only once, during the initial system startup: the operator tries to fetch the master root key from the external KMS storage, but initially it is empty, so the operator generates a random master key and writes it into the external KMS storage. Upon successful write completion, the status is set to "Init". On subsequent KMS reconciliations, the operator successfully reads the key from the external KMS storage and the status is set to "Sync".

So during the initial transition to "Init", both of these log messages are expected:
> 2022-04-04T10:36:37.018701650Z time="2022-04-04T10:36:37Z" level=error msg="ReconcileRootSecret, KMS Get error No Secret Data found for Secret ID" sys=openshift-storage/noobaa
> 2022-04-04T10:36:37.040505500Z time="2022-04-04T10:36:37Z" level=info msg="setKMSConditionStatus Init" sys=openshift-storage/noobaa

I guess the phrasing that includes the word "error" is not the best one for an expected state.

To sum up, the logs indicate an expected status transition. Let me know if you have any more questions.

Thank you,
~baum
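For reference, a minimal sketch of how the two KMS conditions could be read from the NooBaa CR with the kubernetes Python client. The group/version/plural values (noobaa.io/v1alpha1, noobaas) and the CR name are assumptions based on the upstream operator, not something stated in this bug:

# Sketch only: print the KMS-related conditions from the NooBaa CR status.
# Group/version/plural and CR name are assumptions, adjust for your cluster.
from kubernetes import client, config

def print_kms_conditions(namespace="openshift-storage", name="noobaa"):
    config.load_kube_config()
    crd = client.CustomObjectsApi()
    nb = crd.get_namespaced_custom_object(
        group="noobaa.io", version="v1alpha1",
        namespace=namespace, plural="noobaas", name=name,
    )
    for cond in nb.get("status", {}).get("conditions", []):
        # Only the two KMS-related condition types described above are of interest.
        if cond.get("type") in ("KMS-Status", "KMS-Type"):
            print(f'{cond["type"]}: status={cond.get("status")} reason={cond.get("reason")}')

if __name__ == "__main__":
    print_kms_conditions()

Polling KMS-Status this way during installation should show the same "Init" -> "Sync" transition that appears in the operator log.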
Any news here? Is the bug supposed to be fixed now? If so, please move to ON_QA
Error log message "ReconcileRootSecret, KMS Get error No Secret Data found for Secret ID" is still present in logs but from context it seems that this is expected (a query for the secret that is created later?). The message is present before Init and Sync happen. Tested with: ocp 4.12.0-0.nightly-2022-11-07-181244 odf 4.12.0-91