Bug 1803220 - csv-generator for hostpath-provisioner-operator creates invalid deployment spec
Summary: csv-generator for hostpath-provisioner-operator creates invalid deployment spec
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.3.0
Assignee: Alexander Wels
QA Contact: Qixuan Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-14 17:54 UTC by Adam Litke
Modified: 2023-12-15 17:22 UTC (History)

Fixed In Version: hco-bundle-registry-container-v2.3.0-56 hostpath-provisioner-operator-container-v2.3.0-12
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-04 19:10:56 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt hostpath-provisioner-operator pull 51 | 0 | None | closed | Run as user 1000, no need to be root. | 2021-01-18 10:28:17 UTC
Github | kubevirt hostpath-provisioner-operator pull 52 | 0 | None | closed | [release-v0.2] Run as user 1000, no need to be root. | 2021-01-18 10:28:17 UTC
Github | kubevirt hostpath-provisioner-operator pull 53 | 0 | None | closed | Make user 1000590000 instead of 1000 to make it openshift compatible. | 2021-01-18 10:28:17 UTC
Github | kubevirt hostpath-provisioner-operator pull 54 | 0 | None | closed | [release-v0.2] Make user 1000590000 instead of 1000 to make it openshift compatible. | 2021-01-18 10:28:18 UTC
Github | kubevirt hostpath-provisioner-operator pull 55 | 0 | None | closed | [release-v0.3] Make user 1000590000 instead of 1000 to make it openshift compatible. | 2021-01-18 10:28:18 UTC
Red Hat Issue Tracker | CNV-12102 | 0 | None | None | None | 2023-12-15 17:22:16 UTC
Red Hat Product Errata | RHEA-2020:2011 | 0 | None | None | None | 2020-05-04 19:11:06 UTC

Description Adam Litke 2020-02-14 17:54:40 UTC
Description of problem:
The csv-generator for the hostpath-provisioner-operator produces an invalid Deployment spec because it sets RunAsNonRoot: true but does not specify the UID to run as. As a result, HPP does not deploy properly when installed by the CNV HCO.


Version-Release number of selected component (if applicable): 2.2.0


How reproducible: Always

Steps to Reproduce:
1. Install CNV-2.2 using HCO
2. Create HPP CR
3.

Actual results:

HPP operator pod fails with the following message:
"Error: container has runAsNonRoot and image will run as root"


Expected results:
Successful HPP deployment


Additional info:
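The failure above can be illustrated with a simplified model of the non-root check. This is a sketch only, not the actual kubelet code: the function name and parameters are hypothetical, an image without an explicit user is assumed to run as UID 0, and only the "image will run as root" message is taken from this report (the other message is illustrative wording).

```python
from typing import Optional


def verify_run_as_non_root(
    run_as_non_root: Optional[bool],
    run_as_user: Optional[int],
    image_uid: int = 0,  # assumption: an image with no explicit user runs as root
) -> Optional[str]:
    """Return an error message if the container would violate runAsNonRoot, else None."""
    if not run_as_non_root:
        # No non-root requirement, so there is nothing to verify.
        return None
    if run_as_user is not None:
        # An explicit runAsUser overrides whatever user the image declares.
        if run_as_user == 0:
            return "runAsUser 0 conflicts with runAsNonRoot"  # illustrative wording
        return None
    if image_uid == 0:
        # The case from this bug: runAsNonRoot is set, no runAsUser is given,
        # and the image itself would start as root.
        return "container has runAsNonRoot and image will run as root"
    return None


# The spec generated for 2.2.0: runAsNonRoot without runAsUser, root image.
print(verify_run_as_non_root(True, None, 0))
```

Either supplying a non-zero runAsUser or dropping runAsNonRoot makes the modelled check pass, which matches the two fixes discussed in the comments.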

Comment 2 Alexander Wels 2020-02-14 18:32:02 UTC
PRs to fix this, and the backport.

Comment 3 Adam Litke 2020-02-14 20:06:08 UTC
Both PRs merged.

Comment 5 Irina Gulina 2020-02-18 13:27:34 UTC
No idea if it's related, but I have just deployed 2.3 and I don't see HPP pod there. 2.3 csv reports a Failure:

>> oc get csv -n openshift-cnv
NAME                                      DISPLAY                           VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v2.3.0   Container-native virtualization   2.3.0     kubevirt-hyperconverged-operator.v2.2.0   Failed

>> oc describe csv 

      Type     Reason               Age                From                        Message
      ----     ------               ----               ----                        -------
      Normal   RequirementsUnknown  54m (x2 over 54m)  operator-lifecycle-manager  requirements not yet checked
      Normal   RequirementsNotMet   54m (x2 over 54m)  operator-lifecycle-manager  one or more requirements couldn't be found
      Normal   InstallWaiting       53m (x2 over 53m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment virt-operator to become ready: Waiting for rollout to finish: 1 of 2 updated replicas are available...
      Warning  InstallCheckFailed   49m                operator-lifecycle-manager  install timeout
      Normal   NeedsReinstall       49m (x2 over 49m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment hco-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
      Normal   AllRequirementsMet   49m (x3 over 54m)  operator-lifecycle-manager  all requirements found, attempting install
      Normal   InstallSucceeded     49m (x4 over 54m)  operator-lifecycle-manager  waiting for install components to report healthy
      Normal   InstallWaiting       47m (x3 over 53m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment hostpath-provisioner-operator to become ready: Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
      Normal   InstallWaiting       46m (x7 over 54m)  operator-lifecycle-manager  installing: Waiting: waiting for deployment hco-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
      Warning  InstallCheckFailed   44m                operator-lifecycle-manager  install failed: Timeout: deployment hostpath-provisioner-operator not ready before timeout: deployment "hostpath-provisioner-operator" exceeded its progress deadline

>> oc get csv -n openshift-cnv | grep povisioner
>>

>> oc get crd --all-namespaces| grep provisioner
hostpathprovisioners.hostpathprovisioner.kubevirt.io             2020-02-18T12:00:46Z

Version: rh-osbs/container-native-virtualization-hyperconverged-cluster-operator@sha256:9457c90f1f52b2a382b5b0d6c798befd86fdb469048eb55441908c621645d2b9
Created At 2020-02-17 19:19:26

Comment 6 Alexander Wels 2020-02-18 13:31:53 UTC
(In reply to Irina Gulina from comment #5)
> No idea if it's related, but I have just deployed 2.3 and I don't see HPP
> pod there. 2.3 csv reports a Failure:

If you check the hostpath provisioner operator deployment you will see an error there: we set runAsUser to 1000, which apparently is not valid in OpenShift. I created a fix for this upstream, and it's propagating its way into a build. If you want to fix it now, edit the HPP operator deployment and set runAsUser to 1000590000.


Comment 7 Irina Gulina 2020-02-19 12:05:28 UTC
@Alexander OK, I will try it, thanks. But when will the fix be merged, please? Will it be in 2.3?
If not, it should be documented as a known issue in the release notes.

Comment 8 Alexander Wels 2020-02-19 12:11:36 UTC
It's been fixed upstream, and I believe D/S builds are ongoing, so it should be in the next test build.

Comment 15 Alexander Wels 2020-03-18 14:13:08 UTC
So [0] explains what happened:

Originally we just had runAsNonRoot: true in the security context. If you installed the hostpath provisioner in a namespace that didn't have an SCC configured, the pod failed: runAsNonRoot: true was set but no runAsUser was specified, because no SCC automatically added one to the pod's securityContext. This appears to be the case in the openshift-operators namespace.

Then we added a hard-coded runAsUser to the spec, but that failed because the UID range the SCC allows can change between clusters, so hard-coding is not a good plan either.

Currently we don't specify a securityContext at all, so we let the SCC add one for us. If there is no SCC defined we don't use the securityContext at all.


[0] https://docs.openshift.com/container-platform/4.3/authentication/managing-security-context-constraints.html#security-context-constraints-pre-allocated-values_configuring-internal-oauth
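
The range problem described above can be checked offline. A minimal sketch, assuming the namespace's allowed UID block is written as `<start>/<size>` (the form OpenShift uses in the `openshift.io/sa.scc.uid-range` namespace annotation). The function names are hypothetical, and the sample values are the hard-coded UID and the range reported in comment 23 of this bug.

```python
from typing import Tuple


def parse_uid_range(annotation: str) -> Tuple[int, int]:
    """Convert '<start>/<size>' into an inclusive (first, last) UID pair."""
    start_text, size_text = annotation.split("/", 1)
    start, size = int(start_text), int(size_text)
    return start, start + size - 1


def uid_allowed(uid: int, annotation: str) -> bool:
    """Return True if uid falls inside the namespace's allocated block."""
    first, last = parse_uid_range(annotation)
    return first <= uid <= last


# Assumed annotation value matching the range [1000110000, 1000119999].
namespace_range = "1000110000/10000"

print(parse_uid_range(namespace_range))
print(uid_allowed(1000590000, namespace_range))  # the hard-coded UID
```

Because the block is allocated per namespace, any fixed UID can only be valid by coincidence, which is why leaving runAsUser unset and letting the SCC assign one is the more robust choice.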

Comment 16 Alexander Wels 2020-03-26 15:08:40 UTC
To work around this issue, you can edit the hostpath provisioner operator deployment, and remove the securityContext with the runAsNonRoot section from the yaml. This will then allow the operator to properly deploy.

Comment 17 Alexander Wels 2020-03-26 19:02:21 UTC
Example of patching openshift-cnv namespace hostpath provisioner deployment:

kubectl patch deployment hostpath-provisioner-operator -n openshift-cnv -p '{"spec": {"template": {"spec": {"securityContext": {"$patch": "delete", "name": "securityContext"}}}}}'

If you use a different namespace (project), replace -n openshift-cnv with -n <your namespace>.
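
The same removal can also be expressed as an RFC 6902 JSON Patch `remove` operation, which `kubectl patch --type=json` should accept. This is an alternative to the command above, not the one used in this bug. The sketch below only builds the patch document and applies it to an in-memory dict, so it can be inspected without a cluster; the manifest fragment and the helper name are hypothetical, and the helper handles plain dictionary keys only (no list indices or `~` escapes).

```python
import copy
import json

# JSON Patch that drops the pod-level securityContext from a Deployment.
REMOVE_POD_SECURITY_CONTEXT = [
    {"op": "remove", "path": "/spec/template/spec/securityContext"}
]


def apply_remove(manifest: dict, pointer: str) -> dict:
    """Return a copy of manifest with the key addressed by pointer removed."""
    result = copy.deepcopy(manifest)
    *parents, leaf = [part for part in pointer.split("/") if part]
    node = result
    for key in parents:
        node = node[key]  # walk down to the parent of the target key
    del node[leaf]
    return result


# Made-up fragment of the operator Deployment, for illustration only.
deployment = {
    "spec": {
        "template": {
            "spec": {
                "securityContext": {"runAsNonRoot": True},
                "serviceAccountName": "hostpath-provisioner-operator",
            }
        }
    }
}

patched = apply_remove(deployment, REMOVE_POD_SECURITY_CONTEXT[0]["path"])
print(json.dumps(REMOVE_POD_SECURITY_CONTEXT))  # body to pass with -p
print(json.dumps(patched, indent=2))
```

A `remove` operation fails if the target path does not exist, so it is worth checking that the deployment still carries a securityContext before applying it.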

Comment 20 Adam Litke 2020-03-27 17:07:44 UTC
This will be fixed in 2.3.

Comment 23 Qixuan Wang 2020-04-13 09:31:00 UTC
Tested with hostpath-provisioner-operator-container-v2.3.0-12. No SCC was defined, and the HPP deployment's securityContext had no explicit runAs setting. HPP deployed successfully.


I found that if I deployed https://github.com/kubevirt/hostpath-provisioner-operator/blob/master/deploy/operator.yaml in the openshift-operators namespace, I got "unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 1000590000: must be in the ranges: [1000110000, 1000119999]]". I then removed the securityContext value and HPP deployed successfully.


<---snip--->
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000590000
      serviceAccount: hostpath-provisioner-operator
      serviceAccountName: hostpath-provisioner-operator
<---snip--->
  - lastTransitionTime: "2020-04-13T09:19:04Z"
    lastUpdateTime: "2020-04-13T09:19:04Z"
    message: 'pods "hostpath-provisioner-operator-55445c57c-" is forbidden: unable
      to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser:
      Invalid value: 1000590000: must be in the ranges: [1000110000, 1000119999]]'
    reason: FailedCreate
    status: "True"
    type: ReplicaFailure
<---snip--->



I'm curious: why did we set 1000590000 upstream?

Comment 24 Alexander Wels 2020-04-13 11:38:25 UTC
Because I hadn't updated the example deployment upstream yet, I just fixed it.

Comment 25 Qixuan Wang 2020-04-21 09:32:17 UTC
I'd like to let https://github.com/kubevirt/hostpath-provisioner-operator/issues/62 track the upstream fix and move this d/s bug to VERIFIED. Thanks.

Comment 28 errata-xmlrpc 2020-05-04 19:10:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2011

