Description of problem:
virt-launcher pods in a regular namespace (i.e. not named openshift-*) cannot be started on OCP 4.12. In the status of the VMI we see:

status:
  conditions:
  - lastProbeTime: "2022-08-15T06:40:54Z"
    lastTransitionTime: "2022-08-15T06:40:54Z"
    message: virt-launcher pod has not yet been scheduled
    reason: PodNotExists
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-08-15T06:40:54Z"
    message: 'failed to create virtual machine pod: pods "virt-launcher-testvm-6xks9"
      is forbidden: violates PodSecurity "restricted:v1.24": seLinuxOptions (pod and
      container "volumecontainerdisk" set forbidden securityContext.seLinuxOptions:
      type "virt_launcher.process"), allowPrivilegeEscalation != false (containers
      "container-disk-binary", "volumecontainerdisk-init", "compute", "volumecontainerdisk"
      must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities
      (containers "container-disk-binary", "volumecontainerdisk-init", "compute",
      "volumecontainerdisk" must set securityContext.capabilities.drop=["ALL"]; container
      "compute" must not include "SYS_NICE" in securityContext.capabilities.add),
      runAsNonRoot != true (pod or containers "container-disk-binary", "compute" must
      set securityContext.runAsNonRoot=true), runAsUser=0 (pod and containers
      "container-disk-binary", "compute" must not set runAsUser=0), seccompProfile
      (pod or containers "container-disk-binary", "volumecontainerdisk-init", "compute",
      "volumecontainerdisk" must set securityContext.seccompProfile.type to
      "RuntimeDefault" or "Localhost")'
    reason: FailedCreate
    status: "False"
    type: Synchronized
created: true
printableStatus: Starting

Version-Release number of selected component (if applicable):
CNV 4.12 on OCP 4.12

How reproducible:
We are constantly hitting this in CI on OCP 4.12.

Steps to Reproduce:
1. Try to start a VM on OCP 4.12.

Actual results:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.12-e2e-azure-upgrade-cnv/1559052348230209536/artifacts/e2e-azure-upgrade-cnv/test/artifacts/cnv-must-gather-vms/registry-redhat-io-container-native-virtualization-cnv-must-gather-rhel8-sha256-37a2b2f102544ec8e953b473f85505e1d999aa5fde09e1385ebfa365fc4aa732/namespaces/vmsns/kubevirt.io/virtualmachines/custom/testvm.yaml

Expected results:
No PodSecurity-related error for virt-launcher.

Additional info:
All the logs are available here:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.12-e2e-azure-upgrade-cnv/1559052348230209536/artifacts/e2e-azure-upgrade-cnv/test/artifacts/cnv-must-gather-vms/
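For reference, the violations listed above map directly onto the securityContext fields the "restricted:v1.24" profile requires. A minimal sketch of a pod that would pass that profile (pod name and image are illustrative placeholders, not the actual virt-launcher spec; only the field names come from the error message):

apiVersion: v1
kind: Pod
metadata:
  name: restricted-example              # hypothetical name, for illustration only
spec:
  securityContext:
    runAsNonRoot: true                  # addresses: runAsNonRoot != true
    seccompProfile:
      type: RuntimeDefault              # addresses: seccompProfile must be RuntimeDefault or Localhost
  containers:
  - name: compute                       # container name taken from the error message
    image: registry.example/placeholder:latest   # placeholder image
    securityContext:
      allowPrivilegeEscalation: false   # addresses: allowPrivilegeEscalation != false
      capabilities:
        drop: ["ALL"]                   # addresses: unrestricted capabilities
      # Under "restricted" the pod must additionally avoid runAsUser=0, must not
      # add SYS_NICE, and must not set seLinuxOptions.type: virt_launcher.process.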
Proposing this as a blocker.
While Lubo merged the required changes in Kubevirt, we are still missing a change in HCO to enable the PSA feature gate by default. @stirabos, can you comment here when you post the HCO PR?
The PSA feature gate on Kubevirt is now always enabled on HCO-managed deployments.
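For anyone verifying this, one way to check should be to inspect the feature gates on the HCO-managed KubeVirt CR; the resource name and namespace below are the usual CNV defaults and may differ in your deployment:

$ oc get kubevirt -n openshift-cnv kubevirt-kubevirt-hyperconverged \
    -o jsonpath='{.spec.configuration.developerConfiguration.featureGates}'

The printed list should include the PSA gate referred to above (the exact gate name may vary by version).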
Verified on v4.12.0-535 - the VM can be successfully started:

> $ oc get pod
> NAME                            READY   STATUS    RESTARTS   AGE
> virt-launcher-vm-fedora-f558p   2/2     Running   0          8s

But as I see, we do not revert the namespace labels back when the VM is removed:

1) Created a new namespace - it has the default labels:

> $ oc describe ns namespace-test
> Name:    namespace-test
> Labels:  kubernetes.io/metadata.name=namespace-test
>          pod-security.kubernetes.io/enforce=restricted
>          pod-security.kubernetes.io/enforce-version=v1.24

2) Created and started a VM in this namespace - the labels were updated:

> $ oc describe ns namespace-test
> Name:    namespace-test
> Labels:  kubernetes.io/metadata.name=namespace-test
>          pod-security.kubernetes.io/enforce=privileged
>          pod-security.kubernetes.io/enforce-version=v1.24
>          security.openshift.io/scc.podSecurityLabelSync=false

3) Removed the VM - the labels are still the same (not reverted back):

> $ oc get vm
> No resources found in namespace-test namespace.
> $ oc get vmi
> No resources found in namespace-test namespace.
> $ oc get pod
> No resources found in namespace-test namespace.
> $ oc describe ns namespace-test
> Name:    namespace-test
> Labels:  kubernetes.io/metadata.name=namespace-test
>          pod-security.kubernetes.io/enforce=privileged
>          pod-security.kubernetes.io/enforce-version=v1.24
>          security.openshift.io/scc.podSecurityLabelSync=false
Removing the label seems optional to me. The reasoning is that users could already use the namespace as privileged for the whole time the VM was running, so they have effectively been granted that trust already.
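That said, if an admin does want to restore the defaults once all VMs are gone, reverting by hand should be possible. A sketch against the namespace from the verification above, assuming the pre-VM values shown in step 1:

$ oc label ns namespace-test pod-security.kubernetes.io/enforce=restricted --overwrite
$ oc label ns namespace-test security.openshift.io/scc.podSecurityLabelSync-

(The trailing '-' on the second command removes the label.)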
I think it would be good to document this behavior. Moving this BZ to VERIFIED state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.12.0 Images security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:0408