Description of problem:
When creating the checkup's Job, the following error occurs (it can be seen in the Pod's description):
```
Warning  FailedCreate  20s (x5 over 2m30s)  job-controller  Error creating: pods "kubevirt-vm-latency-checkup-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider "containerized-data-importer": Forbidden: not usable by user or serviceaccount, spec.containers[0].securityContext.runAsUser: Invalid value: 1000: must be in the ranges: [1000930000, 1000939999], provider "net-admin": Forbidden: not usable by user or serviceaccount, provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "noobaa": Forbidden: not usable by user or serviceaccount, provider "noobaa-endpoint": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "kubevirt-controller": Forbidden: not usable by user or serviceaccount, provider "bridge-marker": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "ocs-metrics-exporter": Forbidden: not usable by user or serviceaccount, provider "linux-bridge": Forbidden: not usable by user or serviceaccount, provider "kubevirt-handler": Forbidden: not usable by user or serviceaccount, provider "rook-ceph": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "trident": Forbidden: not usable by user or serviceaccount, provider "rook-ceph-csi": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
```

Version-Release number of selected component (if applicable):
4.12.0

How reproducible:

Steps to Reproduce:
1. Create a NetworkAttachmentDefinition.
2. Configure the user-supplied ConfigMap.
3. Create the checkup's Job:
```
---
apiVersion: batch/v1
kind: Job
metadata:
  name: kubevirt-vm-latency-checkup
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: vm-latency-checkup-sa
      restartPolicy: Never
      containers:
        - name: vm-latency-checkup
          image: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-vm-network-latency-checkup:v4.12.0
          securityContext:
            runAsUser: 1000
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            runAsNonRoot: true
            seccompProfile:
              type: "RuntimeDefault"
          env:
            - name: CONFIGMAP_NAMESPACE
              value: <target-namespace>
            - name: CONFIGMAP_NAME
              value: kubevirt-vm-latency-checkup-config
```
4. Describe the created Pod.

Actual results:
The checkup Job's underlying pod doesn't start.

Expected results:
The checkup Job's underlying pod starts.

Additional info:
All actions are performed as a project-admin.
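For context, the allowed UID range quoted in the error comes from the target namespace's SCC annotation. A quick way to inspect it (illustrative command only, with `<target-namespace>` as a placeholder):
```
oc get namespace <target-namespace> \
  -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}'
```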
This is needed to address https://issues.redhat.com/browse/CNV-18990. @awax, could you check whether this is reproducible before we release a fix?
@omisan Can you please provide a full reproduction scenario? I am missing the NAD manifest from step #1 and the ConfigMap from step #2. Thank you.
Hi @ysegev, the problem was in the Dockerfile used to create the checkup's image. You can use any valid ConfigMap and NetworkAttachmentDefinition.
Please note that we no longer use the `runAsUser` field, because that information is baked into the container image.
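Since the user is now defined by the image itself, one illustrative way to confirm it (assuming the image has already been pulled locally and that `podman` is available):
```
podman inspect --format '{{.Config.User}}' \
  registry.redhat.io/container-native-virtualization/vm-network-latency-checkup:v4.12.0
```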
"You can input any valid ConfigMap and NetworkAttachmentDefinition" is a sure recipe for mis-configuration. When reproducing/verifying a bug, the exact scenario is needed. This includes the exact resources used, otherwise the verifier might find themselves attempting to reproduce using non-relevant or false resources. Please provide the NAD and ConfigMAp so I can verify this bug. Thank you
In this specific scenario, the problem was that the checkup Job couldn't start because of the `runAsUser` definition. Because the Job couldn't start, the ConfigMap and NetworkAttachmentDefinition were never read, so their content matters less with regard to reproducing this bug.

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-network
spec:
  config: |
    {
      "cniVersion":"0.3.1",
      "name": "br10",
      "plugins": [
        {
          "type": "cnv-bridge",
          "bridge": "br10"
        }
      ]
    }
```

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevirt-vm-latency-checkup-config
data:
  spec.timeout: 5m
  spec.param.network_attachment_definition_namespace: <target_namespace>
  spec.param.network_attachment_definition_name: <nad_name>
  spec.param.max_desired_latency_milliseconds: "10"
  spec.param.sample_duration_seconds: "5"
```

The checkup Job now looks like this (without `runAsUser`):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: kubevirt-vm-latency-checkup
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: vm-latency-checkup-sa
      restartPolicy: Never
      containers:
        - name: vm-latency-checkup
          image: registry.redhat.io/container-native-virtualization/vm-network-latency-checkup:v4.12.0
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            runAsNonRoot: true
            seccompProfile:
              type: "RuntimeDefault"
          env:
            - name: CONFIGMAP_NAMESPACE
              value: <target_namespace>
            - name: CONFIGMAP_NAME
              value: kubevirt-vm-latency-checkup-config
```
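With `runAsUser` removed, the pod should be admitted under one of the namespace's default SCCs. An illustrative way to check which SCC admitted it once the Job's pod exists (relying on the `job-name` label that Kubernetes adds to Job pods and on the `openshift.io/scc` annotation set by OpenShift):
```
oc get pod -l job-name=kubevirt-vm-latency-checkup \
  -o jsonpath='{.items[0].metadata.annotations.openshift\.io/scc}'
```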
Verified by using these versions:
CNV 4.12.0
vm-network-latency-checkup:v4.12.0-8

Verified by running the following scenario:

1. Create a new namespace and change the context to it:

$ oc create ns yoss-ns
namespace/yoss-ns created
$
$ oc project yoss-ns
Now using project "yoss-ns" on server "https://api.net-ys-412-2.cnv-qe.rhcloud.com:6443".

2. Apply the following NetworkAttachmentDefinition:

$ cat << EOF | oc apply -f -
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-network
spec:
  config: |
    {
      "cniVersion":"0.3.1",
      "name": "br10",
      "plugins": [
        {
          "type": "cnv-bridge",
          "bridge": "br10"
        }
      ]
    }
EOF

3. Apply the following ConfigMap:

$ cat << EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevirt-vm-latency-checkup-config
data:
  spec.timeout: 5m
  spec.param.network_attachment_definition_namespace: "yoss-ns"
  spec.param.network_attachment_definition_name: "bridge-network"
  spec.param.max_desired_latency_milliseconds: "10"
  spec.param.sample_duration_seconds: "5"
EOF

4. Apply the following ServiceAccount, Roles and RoleBindings:

$ cat << EOF | oc apply -f -
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vm-latency-checkup-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubevirt-vm-latency-checker
rules:
  - apiGroups: ["kubevirt.io"]
    resources: ["virtualmachineinstances"]
    verbs: ["get", "create", "delete"]
  - apiGroups: ["subresources.kubevirt.io"]
    resources: ["virtualmachineinstances/console"]
    verbs: ["get"]
  - apiGroups: ["k8s.cni.cncf.io"]
    resources: ["network-attachment-definitions"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubevirt-vm-latency-checker
subjects:
  - kind: ServiceAccount
    name: vm-latency-checkup-sa
roleRef:
  kind: Role
  name: kubevirt-vm-latency-checker
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kiagnose-configmap-access
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kiagnose-configmap-access
subjects:
  - kind: ServiceAccount
    name: vm-latency-checkup-sa
roleRef:
  kind: Role
  name: kiagnose-configmap-access
  apiGroup: rbac.authorization.k8s.io
EOF

5. Apply the following latency Job:

$ cat << EOF | oc apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: kubevirt-vm-latency-checkup
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: vm-latency-checkup-sa
      restartPolicy: Never
      containers:
        - name: vm-latency-checkup
          # image: registry.redhat.io/container-native-virtualization/vm-network-latency-checkup:v4.12.0
          image: brew.registry.redhat.io/rh-osbs/container-native-virtualization-vm-network-latency-checkup:v4.12.0
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            runAsNonRoot: true
            seccompProfile:
              type: "RuntimeDefault"
          env:
            - name: CONFIGMAP_NAMESPACE
              value: yoss-ns
            - name: CONFIGMAP_NAME
              value: kubevirt-vm-latency-checkup-config
EOF

The Job and its pod run successfully:

$ oc get job
NAME                          COMPLETIONS   DURATION   AGE
kubevirt-vm-latency-checkup   0/1           56s        56s
$
$ oc get pod
NAME                                       READY   STATUS    RESTARTS   AGE
kubevirt-vm-latency-checkup-lxdwc          1/1     Running   0          61s
virt-launcher-latency-check-source-4d4sk   2/2     Running   0          58s
virt-launcher-latency-check-target-8tch5   2/2     Running   0          58s
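Once the sampling window elapses, the Job is expected to reach COMPLETIONS 1/1. An illustrative way to wait for completion and then read the results that the checkup is expected to write back into the user-supplied ConfigMap:

$ oc wait job/kubevirt-vm-latency-checkup --for=condition=complete --timeout=10m
$ oc get configmap kubevirt-vm-latency-checkup-config -o yaml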
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Virtualization 4.12.0 Images security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0408