Description of problem (please be as detailed as possible and provide log snippets):

In OCS 4.7.2, creation of encrypted RBD PVCs fails with the following error:

  Warning  ProvisioningFailed  42s (x8 over 105s)  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-7b66b9959c-2smsm_c03689da-56d0-41a3-9ed0-48842e94c381  failed to provision volume with StorageClass "test-pv-encryption": rpc error: code = InvalidArgument desc = invalid encryption kms configuration: missing encryption KMS configuration with 1-vault

However, the 1-vault config is present in the csi-kms-connection-details ConfigMap:

$ oc get cm csi-kms-connection-details -o yaml -n openshift-storage
apiVersion: v1
data:
  1-vault: '{"KMS_PROVIDER":"vaulttokens","KMS_SERVICE_NAME":"vault","VAULT_ADDR":"https://vault.qe.rh-ocs.com:8200","VAULT_BACKEND_PATH":"rbd-encryption","VAULT_CACERT":"ocs-kms-ca-secret-iv4cta","VAULT_TLS_SERVER_NAME":"","VAULT_CLIENT_CERT":"ocs-kms-client-cert-u6yuiq","VAULT_CLIENT_KEY":"ocs-kms-client-key-gz0zb","VAULT_NAMESPACE":"ocs/rbd","VAULT_TOKEN_NAME":"ocs-kms-token","VAULT_CACERT_FILE":"fullchain.pem","VAULT_CLIENT_CERT_FILE":"cert.pem","VAULT_CLIENT_KEY_FILE":"privkey.pem"}'
kind: ConfigMap

Version of all relevant components (if applicable):
OCP: 4.8.0-0.nightly-2021-06-25-182927
OCS: ocs-operator.v4.7.2-429.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, not able to create encrypted RBD PVCs.

Is there any workaround available to the best of your knowledge?
Not that I am aware of.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
Yes, PV encryption was working in OCS 4.7.0 and 4.7.1. It was also tested on the 4.7.2-rc1 build and was working fine.
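As a first triage step, it can help to confirm that the `encryptionKMSID` set on the StorageClass really matches a key in the ConfigMap. A quick sketch (the StorageClass name is taken from the provisioning error above); notably, in this bug both values match and provisioning still fails:

```shell
# Print the KMS ID referenced by the StorageClass ...
oc get sc test-pv-encryption -o jsonpath='{.parameters.encryptionKMSID}{"\n"}'

# ... and the keys available in the csi-kms-connection-details ConfigMap.
oc -n openshift-storage get cm csi-kms-connection-details -o jsonpath='{.data}'
```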
The issue is seen with the live build of OCS 4.7.2.

Steps to Reproduce:
1. Deploy an OCS cluster with live 4.7.2 builds
2. Create an encryption-enabled StorageClass for RBD
3. Create a PVC using the StorageClass created above

Actual results:
PVC creation fails with the error:

  Warning  ProvisioningFailed  42s (x8 over 105s)  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-7b66b9959c-2smsm_c03689da-56d0-41a3-9ed0-48842e94c381  failed to provision volume with StorageClass "test-pv-encryption": rpc error: code = InvalidArgument desc = invalid encryption kms configuration: missing encryption KMS configuration with 1-vault

Expected results:
PVC creation should be successful.

Additional info:
This issue is not seen in OCS 4.8 builds.
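For reference, a minimal encryption-enabled RBD StorageClass of the kind used in step 2 could look like the sketch below. The pool, clusterID, and secret names are the usual OCS defaults and are assumptions here, not values taken from the affected cluster:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test-pv-encryption
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: openshift-storage
  pool: ocs-storagecluster-cephblockpool
  encrypted: "true"
  # Must match a key in the csi-kms-connection-details ConfigMap:
  encryptionKMSID: 1-vault
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
reclaimPolicy: Delete
```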
This is only seen in the 4.7.2-RC2 build, where the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1974816 went in. Keeping it for 4.7.z.
Niels, please fill in the doc text.
There is a workaround for users that want to run the previously released Ceph-CSI container image, where PV encryption with Hashicorp Vault token support was working correctly. I am not in a position to qualify this as a supported workaround, but for testing the functionality it should be sufficient.

Check the tag (or sha256) from the image registry, for example https://catalog.redhat.com/software/containers/ocs4/cephcsi-rhel8/5ddeeeaabed8bd164a0afa64?tag=4.7-104.60731ec.release_4.7

1. Install the OCS Operator from OperatorHub through the UI.
2. Create a StorageCluster once the Operator is installed.
3. When the StorageCluster becomes available, update the CSV in the `openshift-storage` namespace:

$ oc -n openshift-storage get csv
NAME                  DISPLAY                       VERSION   REPLACES   PHASE
ocs-operator.v4.7.2   OpenShift Container Storage   4.7.2                Succeeded

4. Edit the CSV, and replace the references for the `cephcsi-rhel8` image with the image from OCS-4.7.1:

$ oc -n openshift-storage edit csv/ocs-operator.v4.7.2

Look for the section `deployments:` and find the environment variables for the `rook-ceph-operator`:

  - name: ROOK_CSI_CEPH_IMAGE
    value: registry.redhat.io/ocs4/cephcsi-rhel8@sha256:d516aa76acf0ef657919f3d4d3647de8944efb8ce9684b7058fc22a5a7321f10

Replace the image with the version from OCS-4.7.1:

  - name: ROOK_CSI_CEPH_IMAGE
    value: registry.redhat.io/ocs4/cephcsi-rhel8:4.7-104.60731ec.release_4.7

This will cause the deployments and daemonsets for Ceph-CSI to be updated, and the pods related to Ceph-CSI will get restarted.
$ oc -n openshift-storage get deployment/csi-rbdplugin-provisioner
$ oc -n openshift-storage get daemonset/csi-rbdplugin
$ oc -n openshift-storage get pods -l app=csi-rbdplugin-provisioner
$ oc -n openshift-storage get pods -l app=csi-rbdplugin

Verify that the previous container image is used:

$ oc -n openshift-storage describe pod/csi-rbdplugin-provisioner-fc6bddf8f-6pb6k | grep -m1 4.7-104.60731ec.release_4.7
    Image:  registry.redhat.io/ocs4/cephcsi-rhel8:4.7-104.60731ec.release_4.7
It seems that the workaround in comment #9 is not always sufficient. Clusters that run a little longer (days instead of deploy-test-discard) may also need the following changes.

Enable the Ceph Toolbox:

$ oc -n openshift-storage edit ocsinitializations.ocs.openshift.io/ocsinit

and replace `spec: {}` with:

spec:
  enableCephTools: true

Check for the running toolbox pod:

$ oc -n openshift-storage get pods -l app=rook-ceph-tools
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-tools-5d76f864fd-6bhkk   1/1     Running   0          2m26s

RSH into the pod, and change the settings to allow connections from non-current clients (like the Ceph-CSI container images from 4.7.1):

$ oc -n openshift-storage rsh rook-ceph-tools-5d76f864fd-6bhkk
sh-4.4# ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
sh-4.4# ceph config set mon auth_allow_insecure_global_id_reclaim true

By setting these options, the checks for https://docs.ceph.com/en/latest/security/CVE-2021-20288/ are disabled, and the non-patched (old Ceph-CSI) clients can connect. These checks can be enabled again once the container image with the fix for this bug is deployed.
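To confirm the settings took effect (and to revert them once the fixed image is deployed), the values can be read back from the toolbox pod. A sketch, assuming the same toolbox pod name as above:

```shell
# Read back the two mon settings from inside the toolbox pod.
oc -n openshift-storage rsh rook-ceph-tools-5d76f864fd-6bhkk \
    ceph config get mon auth_allow_insecure_global_id_reclaim
oc -n openshift-storage rsh rook-ceph-tools-5d76f864fd-6bhkk \
    ceph config get mon mon_warn_on_insecure_global_id_reclaim_allowed

# To revert after the patched Ceph-CSI image is in place, drop the overrides:
#   ceph config rm mon auth_allow_insecure_global_id_reclaim
#   ceph config rm mon mon_warn_on_insecure_global_id_reclaim_allowed
```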
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.7.3 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3135
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.