Bug 1913069 - pod fails to create with "Host network is not allowed to be used spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 10301" [NEEDINFO]
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: David Eads
QA Contact: scheng
URL:
Whiteboard: LifecycleReset
Duplicates: 1930713 1930715 1932007
Depends On:
Blocks: 1928839
 
Reported: 2021-01-05 21:41 UTC by jamo luhrsen
Modified: 2021-04-29 18:12 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
[sig-auth][Feature:SCC][Early] should not have pod creation failures during install [Suite:openshift/conformance/parallel]
Last Closed: 2021-04-29 18:12:03 UTC
Target Upstream Version:
Embargoed:
Flags: mfojtik: needinfo?



Description jamo luhrsen 2021-01-05 21:41:42 UTC
Description of problem:

This test case:
  "[sig-auth][Feature:SCC][Early] should not have pod creation failures during install"

is failing, and the build-log failure message indicates a problem with using a specific host port (10300):

Error creating: pods "aws-ebs-csi-driver-node-" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 10300: Host ports are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[1].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[1].securityContext.containers[0].hostPort: Invalid value: 10300: Host ports are not allowed to be used spec.containers[2].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[2].securityContext.containers[0].hostPort: Invalid value: 10300: Host ports are not allowed to be used] for DaemonSet.apps/v1/aws-ebs-csi-driver-node -n openshift-cluster-csi-drivers happened 3 times

This is coming from this specific job:
https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.7/1346511357926707200

hostPort: 10300 is defined in CSI pods:
$ rg hostPort * | rg 'hostPort: 10300'
namespaces/openshift-cluster-csi-drivers/apps/daemonsets.yaml:            hostPort: 10300
namespaces/openshift-cluster-csi-drivers/core/pods.yaml:        hostPort: 10300
namespaces/openshift-cluster-csi-drivers/core/pods.yaml:        hostPort: 10300
namespaces/openshift-cluster-csi-drivers/core/pods.yaml:        hostPort: 10300
namespaces/openshift-cluster-csi-drivers/core/pods.yaml:        hostPort: 10300
namespaces/openshift-cluster-csi-drivers/core/pods.yaml:        hostPort: 10300
namespaces/openshift-cluster-csi-drivers/core/pods.yaml:        hostPort: 10300
namespaces/openshift-cluster-csi-drivers/pods/aws-ebs-csi-driver-node-684s9/aws-ebs-csi-driver-node-684s9.yaml:      hostPort: 10300
namespaces/openshift-cluster-csi-drivers/pods/aws-ebs-csi-driver-node-dhmh8/aws-ebs-csi-driver-node-dhmh8.yaml:      hostPort: 10300
namespaces/openshift-cluster-csi-drivers/pods/aws-ebs-csi-driver-node-lmbb8/aws-ebs-csi-driver-node-lmbb8.yaml:      hostPort: 10300
namespaces/openshift-cluster-csi-drivers/pods/aws-ebs-csi-driver-node-trbv4/aws-ebs-csi-driver-node-trbv4.yaml:      hostPort: 10300
namespaces/openshift-cluster-csi-drivers/pods/aws-ebs-csi-driver-node-tx7pc/aws-ebs-csi-driver-node-tx7pc.yaml:      hostPort: 10300
namespaces/openshift-cluster-csi-drivers/pods/aws-ebs-csi-driver-node-r5zjp/aws-ebs-csi-driver-node-r5zjp.yaml:      hostPort: 10300

The same job also shows similar errors for port 10301, which is likewise used in the CSI pods.
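
For context, here is an abbreviated, illustrative sketch of the pod-template fields the restricted SCC is rejecting, reconstructed from the error message above (the volume and container names are hypothetical; this is not the exact manifest):

```
# Sketch of the offending fields in the aws-ebs-csi-driver-node pod template,
# reconstructed from the validation errors above; names are hypothetical.
spec:
  hostNetwork: true            # "Host network is not allowed to be used"
  volumes:
  - name: kubelet-dir          # hypothetical name
    hostPath:                  # "hostPath volumes are not allowed to be used"
      path: /var/lib/kubelet
  containers:
  - name: csi-driver           # hypothetical name
    securityContext:
      privileged: true         # "Privileged containers are not allowed"
    ports:
    - containerPort: 10300
      hostPort: 10300          # "Host ports are not allowed to be used"
```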

This test is failing periodically, and when it does it blocks release payloads from being accepted, like this one:
https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4.7.0-0.nightly/release/4.7.0-0.nightly-2021-01-05-104202

Must-gather for this job:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.7/1346511357926707200/artifacts/e2e-aws/must-gather.tar

Testgrid shows it failing in 3 of the last 5 runs:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-blocking#release-openshift-ocp-installer-e2e-aws-4.7&grid=old

Version-Release number of selected component (if applicable):

search.ci shows it in 4.7 and 4.6. I did not check other versions:
https://search.ci.openshift.org/?search=Invalid+value%3A+10301%3A+Host+ports+are+not+allowed+to+be+used&maxAge=48h&context=1&type=build-log&name=4.6&maxMatches=5&maxBytes=20971520&groupBy=job
https://search.ci.openshift.org/?search=Invalid+value%3A+10301%3A+Host+ports+are+not+allowed+to+be+used&maxAge=48h&context=1&type=build-log&name=4.7&maxMatches=5&maxBytes=20971520&groupBy=job

Comment 1 jamo luhrsen 2021-01-05 21:42:37 UTC
Storage/Storage component/subcomponent is just my best guess. Please adjust as needed.

Comment 2 Jan Safranek 2021-01-13 13:12:28 UTC
The test should be more tolerant to errors. Fix: https://github.com/openshift/kubernetes/pull/391

Comment 3 jamo luhrsen 2021-01-21 18:04:54 UTC
(In reply to Jan Safranek from comment #2)
> The test should be more tolerant to errors. Fix:
> https://github.com/openshift/kubernetes/pull/391

I see the PR is sitting idle for a week now. I came across this problem again today while trying
to debug some 4.7 failures:

https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ocp-4.7-e2e-aws-proxy/1352181772875468800

The top-level failure is about crashlooping pods. In the test log you can see this:

  Jan 21 10:49:33.608: INFO: At 2021-01-21 09:30:40 +0000 UTC - event for aws-ebs-csi-driver-controller-595bc5b465: {replicaset-controller } FailedCreate: Error creating: pods "aws-ebs-csi-driver-controller-595bc5b465-" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[1].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[1].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[2].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[2].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[3].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[3].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[4].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[4].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[5].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[5].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used]

Test log:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ocp-4.7-e2e-aws-proxy/1352181772875468800/artifacts/e2e-aws-proxy/openshift-e2e-test/e2e.log
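
For reference, the `provider restricted:` messages above come from the default restricted SCC. Its relevant fields look roughly like this (an abbreviated sketch; `oc get scc restricted -o yaml` on a live cluster gives the authoritative definition):

```
# Abbreviated sketch of the default "restricted" SCC fields that drive
# the rejections above; not the complete SCC definition.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: restricted
allowHostNetwork: false          # -> "Host network is not allowed to be used"
allowHostPorts: false            # -> "Host ports are not allowed to be used"
allowPrivilegedContainer: false  # -> "Privileged containers are not allowed"
allowHostDirVolumePlugin: false  # -> "hostPath volumes are not allowed to be used"
volumes:                         # hostPath is notably absent from this list
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
```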

Comment 6 Standa Laznicka 2021-03-01 08:58:29 UTC
*** Bug 1932007 has been marked as a duplicate of this bug. ***

Comment 7 Standa Laznicka 2021-03-01 08:59:30 UTC
*** Bug 1930715 has been marked as a duplicate of this bug. ***

Comment 8 Micah Abbott 2021-03-16 15:12:01 UTC
Still seeing this during recent CI runs:

https://sippy.ci.openshift.org/?release=4.7

```
Error creating: pods "cloud-credential-operator-5675cb8c55-" is forbidden: unable to validate against any security context constraint: [] for ReplicaSet.apps/v1/cloud-credential-operator-5675cb8c55 -n openshift-cloud-credential-operator happened 1 times
Error creating: pods "aws-ebs-csi-driver-controller-6d75566766-" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[1].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[1].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[2].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[2].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[3].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[3].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[4].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[4].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used spec.containers[5].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[5].securityContext.containers[0].hostPort: Invalid value: 10301: Host ports are not allowed to be used] for ReplicaSet.apps/v1/aws-ebs-csi-driver-controller-6d75566766 -n openshift-cluster-csi-drivers happened 4 times
Error creating: pods "console-operator-88868b75f-" is forbidden: unable to validate against any security context constraint: [] for ReplicaSet.apps/v1/console-operator-88868b75f -n openshift-console-operator happened 11 times
Error creating: pods "downloads-5bb6748bc5-" is forbidden: unable to validate against any security context constraint: [] for ReplicaSet.apps/v1/downloads-5bb6748bc5 -n openshift-console happened 11 times
Error creating: pods "router-default-57974b7f5b-" is forbidden: unable to validate against any security context constraint: [] for ReplicaSet.apps/v1/router-default-57974b7f5b -n openshift-ingress happened 11 times
Error creating: pods "marketplace-operator-d98c89b9c-" is forbidden: unable to validate against any security context constraint: [] for ReplicaSet.apps/v1/marketplace-operator-d98c89b9c -n openshift-marketplace happened 11 times
```

Comment 9 Standa Laznicka 2021-03-19 11:43:34 UTC
*** Bug 1930713 has been marked as a duplicate of this bug. ***

Comment 10 Michael McCune 2021-03-30 13:38:00 UTC
I think we have also hit this bug during the CI tests on this PR: https://github.com/openshift/machine-api-operator/pull/830

I see event error output in the must-gather that is similar to what is described in this bug:

```
Error creating: pods "aws-ebs-csi-driver-node-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used, spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 10300: Host ports are not allowed to be used, spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, spec.containers[1].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used, spec.containers[1].securityContext.containers[0].hostPort: Invalid value: 10300: Host ports are not allowed to be used, spec.containers[2].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used, spec.containers[2].securityContext.containers[0].hostPort: Invalid value: 10300: Host ports are not allowed to be used, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
```
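
The long list of `provider "...": Forbidden: not usable by user or serviceaccount` entries suggests the driver's service account had not yet been granted access to any permissive SCC at that point in the install. Access to an SCC is normally granted through RBAC with the `use` verb on the SCC resource; here is a minimal sketch under that assumption (the role, binding, and service-account names are hypothetical, not taken from the actual operator manifests):

```
# Minimal sketch: granting a service account use of the "privileged" SCC
# via RBAC. All names here are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: csi-node-privileged-scc      # hypothetical
rules:
- apiGroups: ["security.openshift.io"]
  resources: ["securitycontextconstraints"]
  resourceNames: ["privileged"]
  verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: csi-node-privileged-scc      # hypothetical
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: csi-node-privileged-scc
subjects:
- kind: ServiceAccount
  name: aws-ebs-csi-driver-node-sa   # hypothetical
  namespace: openshift-cluster-csi-drivers
```

Until such a binding is reconciled, SCC admission can only try the restricted SCC against these pods, which would explain why the failures are transient and the pods eventually come up once the install settles.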

Comment 11 Michal Fojtik 2021-04-29 14:06:10 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

Comment 12 David Eads 2021-04-29 14:10:06 UTC
I think we fixed this sometime in 4.7.

Comment 13 W. Trevor King 2021-04-29 17:32:42 UTC
Are you sure? It certainly looks like 4.7 is still impacted:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=Invalid+value%3A+10301%3A+Host+ports+are+not+allowed+to+be+used&maxAge=24h&type=junit' | grep 'failures match' | sort
periodic-ci-openshift-release-master-ci-4.6-e2e-aws-upgrade-rollback (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.6-upgrade-from-stable-4.5-e2e-aws-ovn-upgrade (all) - 10 runs, 100% failed, 20% of failures match = 20% impact
periodic-ci-openshift-release-master-ci-4.7-upgrade-from-stable-4.6-e2e-aws-ovn-upgrade (all) - 15 runs, 33% failed, 20% of failures match = 7% impact
periodic-ci-openshift-release-master-nightly-4.6-e2e-aws-fips (all) - 7 runs, 29% failed, 250% of failures match = 71% impact
periodic-ci-openshift-release-master-nightly-4.6-e2e-aws-proxy (all) - 7 runs, 100% failed, 86% of failures match = 86% impact
periodic-ci-openshift-release-master-nightly-4.6-e2e-ovirt (all) - 9 runs, 78% failed, 114% of failures match = 89% impact
periodic-ci-openshift-release-master-nightly-4.7-e2e-aws (all) - 15 runs, 40% failed, 17% of failures match = 7% impact
periodic-ci-openshift-release-master-nightly-4.7-e2e-aws-proxy (all) - 8 runs, 50% failed, 25% of failures match = 13% impact
periodic-ci-openshift-release-master-nightly-4.7-e2e-aws-serial (all) - 19 runs, 63% failed, 17% of failures match = 11% impact
periodic-ci-openshift-release-master-nightly-4.7-e2e-aws-upgrade (all) - 8 runs, 75% failed, 17% of failures match = 13% impact
periodic-ci-openshift-release-master-nightly-4.7-e2e-ovirt (all) - 12 runs, 33% failed, 25% of failures match = 8% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-ovirt (all) - 11 runs, 45% failed, 20% of failures match = 9% impact
pull-ci-cri-o-cri-o-release-1.19-e2e-aws (all) - 4 runs, 25% failed, 200% of failures match = 50% impact
pull-ci-openshift-cluster-api-provider-aws-release-4.7-e2e-aws-upgrade (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
pull-ci-openshift-cluster-dns-operator-release-4.7-e2e-aws (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
pull-ci-openshift-cluster-network-operator-release-4.7-e2e-ovn-hybrid-step-registry (all) - 6 runs, 83% failed, 20% of failures match = 17% impact
pull-ci-openshift-installer-release-4.7-e2e-aws (all) - 4 runs, 50% failed, 50% of failures match = 25% impact
pull-ci-openshift-multus-cni-release-4.7-e2e-aws (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
release-openshift-ocp-installer-e2e-aws-mirrors-4.6 (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
release-openshift-ocp-installer-e2e-aws-ovn-4.6 (all) - 7 runs, 29% failed, 250% of failures match = 71% impact
release-openshift-ocp-installer-e2e-aws-upi-4.6 (all) - 7 runs, 86% failed, 17% of failures match = 14% impact
release-openshift-origin-installer-e2e-aws-sdn-network-stress-4.7 (all) - 3 runs, 67% failed, 100% of failures match = 67% impact

The 4.8 match there is [1].  But can whatever is helping 4.8 so much get ported back to 4.7?

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.8-e2e-ovirt/1387613619692244992

Comment 14 Michal Fojtik 2021-04-29 18:06:14 UTC
The LifecycleStale keyword was removed because the bug got commented on recently.
The bug assignee was notified.

Comment 15 David Eads 2021-04-29 18:12:03 UTC
OK, 4.8 then. It's not worth the backport. The behavior always worked fine eventually; this was just a little noise.

