Description of problem:
The csv-generator for the hostpath-provisioner-operator produces an invalid Deployment spec because it sets runAsNonRoot: true but does not specify the UID to run as. The result is that HPP will not deploy properly when installed by the CNV HCO.

Version-Release number of selected component (if applicable):
2.2.0

How reproducible:
Always

Steps to Reproduce:
1. Install CNV-2.2 using HCO
2. Create HPP CR

Actual results:
HPP operator pod fails with the following message: "Error: container has runAsNonRoot and image will run as root"

Expected results:
Successful HPP deployment
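For reference, a minimal sketch of the failing pattern described above (reconstructed for illustration; this is not the actual generated manifest, and the image name is a placeholder):

    # Illustration only: a pod securityContext that asserts runAsNonRoot but
    # names no UID. If the image's default user is root, the kubelet refuses
    # to start the container with:
    #   "Error: container has runAsNonRoot and image will run as root"
    spec:
      template:
        spec:
          securityContext:
            runAsNonRoot: true   # no runAsUser, so nothing proves a non-root UID
          containers:
          - name: hostpath-provisioner-operator
            image: <hpp-operator-image>   # placeholder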
PRs to fix this, and the backport.
Both PRs merged.
No idea if it's related, but I have just deployed 2.3 and I don't see HPP pod there. 2.3 csv reports a Failure:

>> oc get csv -n openshift-cnv
NAME                                      DISPLAY                           VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v2.3.0   Container-native virtualization   2.3.0     kubevirt-hyperconverged-operator.v2.2.0   Failed

>> oc describe csv
Type     Reason               Age                From                        Message
----     ------               ----               ----                        -------
Normal   RequirementsUnknown  54m (x2 over 54m)  operator-lifecycle-manager  requirements not yet checked
Normal   RequirementsNotMet   54m (x2 over 54m)  operator-lifecycle-manager  one or more requirements couldn't be found
Normal   InstallWaiting       53m (x2 over 53m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment virt-operator to become ready: Waiting for rollout to finish: 1 of 2 updated replicas are available...
Warning  InstallCheckFailed   49m                operator-lifecycle-manager  install timeout
Normal   NeedsReinstall       49m (x2 over 49m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment hco-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
Normal   AllRequirementsMet   49m (x3 over 54m)  operator-lifecycle-manager  all requirements found, attempting install
Normal   InstallSucceeded     49m (x4 over 54m)  operator-lifecycle-manager  waiting for install components to report healthy
Normal   InstallWaiting       47m (x3 over 53m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment hostpath-provisioner-operator to become ready: Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
Normal   InstallWaiting       46m (x7 over 54m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment hco-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
Warning  InstallCheckFailed   44m                operator-lifecycle-manager  install failed: Timeout: deployment hostpath-provisioner-operator not ready before timeout: deployment "hostpath-provisioner-operator" exceeded its progress deadline

>> oc get csv -n openshift-cnv | grep povisioner
>>

>> oc get crd --all-namespaces | grep provisioner
hostpathprovisioners.hostpathprovisioner.kubevirt.io   2020-02-18T12:00:46Z

Version: rh-osbs/container-native-virtualization-hyperconverged-cluster-operator@sha256:9457c90f1f52b2a382b5b0d6c798befd86fdb469048eb55441908c621645d2b9
Created At 2020-02-17 19:19:26
(In reply to Irina Gulina from comment #5)
> No idea if it's related, but I have just deployed 2.3 and I don't see HPP
> pod there. 2.3 csv reports a Failure:

If you check the hostpath provisioner operator deployment you will see an error in there: we set the runAsUser to 1000, which apparently is not valid in OpenShift. I created a fix for this upstream, and it's propagating its way into a build. If you want to fix it now, edit the HPP operator deployment and set the runAsUser to 1000590000.
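For anyone applying that interim workaround, a minimal sketch (assumptions: the UID 1000590000 is the one mentioned above, but the valid value must fall inside your namespace's pre-allocated SCC UID range, which may differ per cluster):

    # Illustration of the interim workaround, not an official procedure.
    oc -n openshift-cnv edit deployment hostpath-provisioner-operator
    # ...then change the pod-level securityContext to:
    #   securityContext:
    #     runAsNonRoot: true
    #     runAsUser: 1000590000
    # Check the namespace's allowed UID range first:
    oc get namespace openshift-cnv -o yaml | grep uid-range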
@Alexander OK, I will try it, thanks. But when will the fix be merged, please? Will it be in 2.3? If not, it should be documented as a known issue in the release notes.
It's been fixed upstream, and I believe D/S builds are ongoing, so it should be in the next test build.
So [0] explains what happened:

Originally we just had runAsNonRoot: true in the security context. If you installed the hostpath provisioner in a namespace that didn't have an SCC configured, it would fail: with runAsNonRoot: true set but no runAsUser specified, the SCC didn't automatically add a UID to the securityContext of the pod. This appears to be the case in the openshift-operators namespace.

Then we added a hard-coded runAsUser to the spec, but that failed because the SCC's UID range can change between clusters, so hard-coding is not a good plan either.

Currently we don't specify a securityContext at all, so we let the SCC add one for us. If there is no SCC defined, no securityContext is used at all.

[0] https://docs.openshift.com/container-platform/4.3/authentication/managing-security-context-constraints.html#security-context-constraints-pre-allocated-values_configuring-internal-oauth
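To make the three iterations concrete, here is a sketch of how the pod-level securityContext in the operator Deployment evolved, reconstructed from the explanation above (not copied from the actual manifests):

    # Iteration 1 -- runAsNonRoot without a UID; fails in namespaces where
    # no SCC injects a runAsUser:
    securityContext:
      runAsNonRoot: true

    # Iteration 2 -- hard-coded UID; fails where the SCC's pre-allocated
    # UID range differs from the hard-coded value:
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000

    # Iteration 3 (current) -- omit the securityContext stanza entirely and
    # let the SCC admission plugin fill in a UID from the namespace's
    # pre-allocated range.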
To work around this issue, you can edit the hostpath provisioner operator deployment and remove the securityContext section (the one containing runAsNonRoot) from the YAML. This will then allow the operator to deploy properly.
Example of patching the hostpath provisioner deployment in the openshift-cnv namespace:

kubectl patch deployment hostpath-provisioner-operator -n openshift-cnv -p '{"spec": {"template": {"spec": {"securityContext": {"$patch": "delete", "name": "securityContext"}}}}}'

If you have a different namespace (project), replace -n openshift-cnv with -n <your namespace>.
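An equivalent approach, assuming JSON merge-patch semantics (setting a field to null removes it), would be:

    # Alternative sketch: a JSON merge patch that nulls out the pod-level
    # securityContext instead of using the strategic-merge $patch directive.
    kubectl patch deployment hostpath-provisioner-operator -n openshift-cnv \
      --type merge -p '{"spec": {"template": {"spec": {"securityContext": null}}}}'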
This will be fixed in 2.3.
Tested with hostpath-provisioner-operator-container-v2.3.0-12. SCC wasn't defined, and the HPP deployment's securityContext had no explicit runAs setting. HPP deployed successfully.

I found that if I deployed https://github.com/kubevirt/hostpath-provisioner-operator/blob/master/deploy/operator.yaml in the openshift-operators NS, I got "unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 1000590000: must be in the ranges: [1000110000, 1000119999]]". Then I removed the securityContext values and HPP deployed successfully.

<---snip--->
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000590000
      serviceAccount: hostpath-provisioner-operator
      serviceAccountName: hostpath-provisioner-operator
<---snip--->
  - lastTransitionTime: "2020-04-13T09:19:04Z"
    lastUpdateTime: "2020-04-13T09:19:04Z"
    message: 'pods "hostpath-provisioner-operator-55445c57c-" is forbidden: unable
      to validate against any security context constraint:
      [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value:
      1000590000: must be in the ranges: [1000110000, 1000119999]]'
    reason: FailedCreate
    status: "True"
    type: ReplicaFailure
<---snip--->

I'm curious: why did we set 1000590000 upstream?
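For anyone reproducing this verification, a sketch of how to see what the SCC admission plugin actually injected (the pod name is a placeholder; the SCC mutates the pod at admission, not the Deployment template):

    # Illustrative check, not part of the original verification: find the
    # HPP operator pod, then inspect the securityContext injected into it.
    oc get pods -n openshift-cnv | grep hostpath-provisioner-operator
    oc get pod <hpp-operator-pod-name> -n openshift-cnv -o yaml | grep -B1 -A3 securityContext
    # The injected runAsUser must fall inside the namespace's pre-allocated range:
    oc get namespace openshift-cnv -o yaml | grep uid-range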
Because I hadn't updated the example deployment upstream yet; I just fixed it.
I'd like to let https://github.com/kubevirt/hostpath-provisioner-operator/issues/62 track the upstream fix, and move this d/s bug to VERIFIED. Thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2011