Bug 2084433

Summary: PodSecurity violation errors are logged for the ingresscontroller during deployment.
Product: OpenShift Container Platform
Reporter: Arvind iyengar <aiyengar>
Component: Networking
Sub component: router
Assignee: Miciah Dashiel Butler Masters <mmasters>
QA Contact: Arvind iyengar <aiyengar>
Status: CLOSED ERRATA
Severity: medium
Priority: high
CC: mmasters
Version: 4.11
Target Release: 4.11.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Last Closed: 2022-08-10 11:11:30 UTC
Type: Bug

Description Arvind iyengar 2022-05-12 06:40:18 UTC
Description of problem:
In the current 4.11 images, the ingress operator logs PodSecurity violation warnings for the ingresscontroller and canary deployments. This appears to be because the PodSecurity feature gate is enabled by default as of the Kubernetes v1.23 release, and from OpenShift 4.11 the pod security admission level defaults to "restricted". Ref: https://kubernetes.io/docs/concepts/security/pod-security-admission/
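
For context, pod security admission is configured per namespace through labels. A minimal sketch of what the "restricted" labels look like on a namespace (the namespace name below is only illustrative, not taken from this bug):

apiVersion: v1
kind: Namespace
metadata:
  name: example-namespace          # illustrative name
  labels:
    # enforce rejects violating pods, audit records violations, warn returns API warnings
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted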

Version-Release number of selected component (if applicable):
v4.11 nightly

How reproducible:
frequently

Actual results:
1. PodSecurity "would violate" warnings appear in the ingress-operator logs for the default router and canary pods, and again after deploying a custom ingresscontroller.

Errors:
2022-05-12T03:40:47.669Z	INFO	operator.init.KubeAPIWarningLogger	rest/warnings.go:144	would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "serve-healthcheck-canary" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "serve-healthcheck-canary" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "serve-healthcheck-canary" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "serve-healthcheck-canary" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2022-05-12T03:42:05.326Z	INFO	operator.init.KubeAPIWarningLogger	rest/warnings.go:144	would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true), hostPort (container "router" uses hostPorts 1936, 443, 80), allowPrivilegeEscalation != false (container "router" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "router" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "router" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "router" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2022-05-12T04:47:59.809Z	INFO	operator.init.KubeAPIWarningLogger	rest/warnings.go:144	would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true), hostPort (container "router" uses hostPorts 1936, 443, 80), allowPrivilegeEscalation != false (containers "router", "logs" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "router", "logs" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or containers "router", "logs" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "router", "logs" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
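
The warnings above enumerate the securityContext fields the "restricted" profile expects. A minimal sketch of a pod spec fragment that would satisfy those particular checks (container name and image are illustrative):

    spec:
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: example              # illustrative container name
        image: example:latest      # illustrative image
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL

Note that the router pods additionally use hostNetwork and hostPorts, which the "restricted" profile rejects outright; consistent with that, the verification below shows the openshift-ingress namespace carrying "privileged" pod-security labels in addition to the tightened security contexts.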

As per the documentation, pod security admission works in conjunction with namespace-level labels and places requirements on a pod's securityContext and other related fields. Currently, no pod-security labels are defined on the ingress operator and operand namespaces:

# oc get ns openshift-ingress -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/node-selector: ""
    openshift.io/sa.scc.mcs: s0:c25,c0
    openshift.io/sa.scc.supplemental-groups: 1000600000/10000
    openshift.io/sa.scc.uid-range: 1000600000/10000
    workload.openshift.io/allowed: management
  creationTimestamp: "2022-05-12T03:32:42Z"
  labels:
    kubernetes.io/metadata.name: openshift-ingress
    name: openshift-ingress
    network.openshift.io/policy-group: ingress
    olm.operatorgroup.uid/64277ac8-5484-49db-8aad-089ec4505e0b: ""
    openshift.io/cluster-monitoring: "true"
    policy-group.network.openshift.io/ingress: ""
  name: openshift-ingress
  resourceVersion: "13401"
  uid: 008615e5-d9f7-4f64-8a53-6403cfbf94bb
spec:
  finalizers:
  - kubernetes
status:
  phase: Active


# oc get ns openshift-ingress-operator -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    openshift.io/node-selector: ""
    openshift.io/sa.scc.mcs: s0:c11,c10
    openshift.io/sa.scc.supplemental-groups: 1000130000/10000
    openshift.io/sa.scc.uid-range: 1000130000/10000
    workload.openshift.io/allowed: management
  creationTimestamp: "2022-05-12T03:27:55Z"
  labels:
    kubernetes.io/metadata.name: openshift-ingress-operator
    olm.operatorgroup.uid/64277ac8-5484-49db-8aad-089ec4505e0b: ""
    openshift.io/cluster-monitoring: "true"
  name: openshift-ingress-operator
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: ce2961ef-9778-4a6e-aa21-3b3059e21ed2
  resourceVersion: "7393"
  uid: 72eb1e90-5200-447c-a7e5-68461d864abb
spec:
  finalizers:
  - kubernetes
status:
  phase: Active

Whereas the deployed pods have "allowPrivilegeEscalation" set to "true":
-----
openshift-ingress router pod spec (excerpt):
    spec:
      containers:
        readinessProbe:
          failureThreshold: 3
          httpGet:
            host: localhost
            path: /healthz/ready
            port: 1936
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
        securityContext:
          allowPrivilegeEscalation: true
      ......
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
        node-role.kubernetes.io/worker: ""
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: router
      serviceAccountName: router
      terminationGracePeriodSeconds: 3600
      topologySpreadConstraints:
-----
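
One way to see which workloads in a namespace would trip a given pod security level, without changing anything, is a server-side dry run of the namespace label. A sketch based on the upstream pod-security-admission documentation (assuming cluster-admin access):

# report would-be PodSecurity violations without actually applying the label
oc label --dry-run=server --overwrite ns openshift-ingress \
    pod-security.kubernetes.io/enforce=restricted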

Expected results:
The security contexts should be set appropriately so that no PodSecurity warnings are logged.


Additional info:

Comment 3 Arvind iyengar 2022-05-18 12:41:03 UTC
Verified in "4.11.0-0.nightly-2022-05-18-010528" release version. There are no more "podsecurity" errors noted during ingresscontroller pod creation and it is observed the canary and router resources have the security context set properly:
-------
oc get clusterversion                          
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-18-010528   True        False         3h35m   Cluster version is 4.11.0-0.nightly-2022-05-18-010528

oc -n openshift-ingress-operator get pods -o wide 
NAME                                READY   STATUS    RESTARTS       AGE     IP            NODE                        NOMINATED NODE   READINESS GATES
ingress-operator-54c67bbfdf-2dwj4   2/2     Running   1 (4h1m ago)   4h12m   10.128.0.32   aiyengartq-xtngl-master-0   <none>           <none>

oc -n openshift-ingress-operator logs pod/ingress-operator-54c67bbfdf-2dwj4 -c ingress-operator  | grep -i "podsecurity" | wc -l
0

oc get ns openshift-ingress -o yaml        
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/node-selector: ""
    openshift.io/sa.scc.mcs: s0:c24,c19
    openshift.io/sa.scc.supplemental-groups: 1000590000/10000
    openshift.io/sa.scc.uid-range: 1000590000/10000
    workload.openshift.io/allowed: management
  creationTimestamp: "2022-05-18T08:33:46Z"
  labels:
    kubernetes.io/metadata.name: openshift-ingress
    name: openshift-ingress
    network.openshift.io/policy-group: ingress
    olm.operatorgroup.uid/1f62d689-46a4-49e1-8fc4-8260c31d95e9: ""
    openshift.io/cluster-monitoring: "true"
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/warn: privileged
    policy-group.network.openshift.io/ingress: ""
  name: openshift-ingress
  resourceVersion: "16153"
  uid: 0e212e1c-705e-4789-98f4-86ac5bf3a201
spec:
  finalizers:
  - kubernetes
status:
  phase: Active

oc -n  openshift-ingress-canary get daemonset -o yaml
apiVersion: v1
items:
- apiVersion: apps/v1
  kind: DaemonSet
  metadata:
    annotations:
      deprecated.daemonset.template.generation: "1"
    creationTimestamp: "2022-05-18T08:38:05Z"
....
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
....
        securityContext:
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
...
-------
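
For reference, the pod-security labels visible on the openshift-ingress namespace above ship with the fixed build. Applied by hand to some other namespace they would look like this (a sketch only; the namespace name is illustrative):

oc label ns example-namespace \
    pod-security.kubernetes.io/enforce=privileged \
    pod-security.kubernetes.io/audit=privileged \
    pod-security.kubernetes.io/warn=privileged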

Comment 5 errata-xmlrpc 2022-08-10 11:11:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069