Bug 1816475 - Request to change default whitelist capability handling in Security Context Constraints (SCC) definitions.
Summary: Request to change default whitelist capability handling in Security Context C...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-apiserver
Version: 4.3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Stefan Schimanski
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-24 04:50 UTC by Mark Rooks
Modified: 2021-03-29 01:32 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-19 10:50:25 UTC
Target Upstream Version:
Embargoed:


Description Mark Rooks 2020-03-24 04:50:45 UTC
Description of problem:

Request to change default whitelist capability handling in Security Context Constraints (SCC) definitions.

Version-Release number of selected component (if applicable):

OCP 4.3

How reproducible:
Always

Steps to Reproduce:

1. Log on to an OCP cluster. 
2. Drop ALL capabilities in the container SecurityContext, then add back only the specific capabilities required for the workload.
3. Assign a native (platform-shipped) SecurityContextConstraints (SCC) to the ServiceAccount that the Pod will run under.
4. The resulting container SecurityContext looks like this:

containers:
  # name, image and other container fields omitted
  - securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
        add:
        - KILL
        - CHOWN

5. Schedule the Pod for deployment using:
$ oc apply -f 
6. The Pod fails to schedule, because none of the native OpenShift SCCs (other than privileged) list any capabilities in their `allowedCapabilities` field.
7. In the above example, I would instead have to explicitly drop every capability other than `KILL` and `CHOWN`.
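
A minimal sketch of the workaround described in step 7. It assumes the container runtime's default capability set is CHOWN, DAC_OVERRIDE, FSETID, FOWNER, NET_RAW, SETGID, SETUID, SETPCAP, NET_BIND_SERVICE, SYS_CHROOT and KILL. That list is an assumption and is not stated in this report, so check the runtime configuration on the nodes before relying on it.

```yaml
# Sketch only: replaces "drop: ALL" + "add" with an explicit drop list.
# ASSUMPTION: the default capability set is the list below plus KILL and CHOWN.
containers:
  # name, image and other container fields omitted
  - securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - DAC_OVERRIDE
        - FOWNER
        - FSETID
        - NET_BIND_SERVICE
        - NET_RAW
        - SETGID
        - SETPCAP
        - SETUID
        - SYS_CHROOT
        # KILL and CHOWN are deliberately not listed, so they are not dropped here
```

Because nothing is listed under `add`, the empty `allowedCapabilities` field of the SCC is not consulted. The drawback is the one raised in this report: if the default set grows, the new capability is silently retained. Note also that an SCC may list capabilities under `requiredDropCapabilities`, and those are dropped regardless of what the container spec keeps.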

Actual results:
The Pod fails to schedule, because none of the native OpenShift SCCs (other than privileged) list any capabilities in their `allowedCapabilities` field.

Expected results:

The `allowedCapabilities` field should be populated with sensible defaults, so that the Pod schedules as expected after applying any allowed customisation with `$ oc apply -f`.

Additional info:
SFDC Case 02609195

Comment 1 Mark Rooks 2020-03-24 04:53:54 UTC
I would expect SecurityContextConstraints and the Pod admission controller to have some way of determining the default capabilities of the kubelet on the Node where the Pod is to be scheduled. This would make it possible to drop ALL capabilities and add back only the ones the workload needs.

Alternatively, all of the namespace-safe capabilities could be added to the default SCCs shipped with Red Hat OpenShift, so that the drop functionality can be achieved with the platform defaults.
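
As a rough illustration of what this alternative amounts to, the sketch below shows a custom SCC whose `allowedCapabilities` field is populated. The SCC name is hypothetical, the two capabilities are simply the ones from the example in the description (this report does not define which capabilities count as namespace-safe), and the remaining settings are illustrative and not copied from any shipped SCC.

```yaml
# Sketch only: a custom SCC that permits "drop: ALL" followed by "add: [KILL, CHOWN]".
# ASSUMPTION: the name and the non-capability settings are illustrative.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: restricted-allow-caps-example   # hypothetical name
allowPrivilegedContainer: false
allowPrivilegeEscalation: false
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
readOnlyRootFilesystem: false
# Capabilities a container may request under capabilities.add
allowedCapabilities:
- CHOWN
- KILL
# Nothing is force-dropped, so the container's own drop list decides
requiredDropCapabilities: []
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
```

With an SCC like this granted to the ServiceAccount, a container that drops ALL and adds back KILL and CHOWN should pass the `allowedCapabilities` check. The request in this bug is for the shipped SCCs to behave this way without a custom definition.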

Benefits of Enhancement

Adding the default capabilities to the whitelist of allowed capabilities out of the box would give the following benefits:

1. Without drop ALL, any new capability added to the default set is automatically opened up in the container, which is a threat vector for the workload.
2. Dropping ALL capabilities and adding back only the ones needed is more explicit. It is the most declarative way to define the capabilities in a container's security context.
3. If a default capability were to change, the change would be picked up and enforced by the platform automatically, instead of requiring a change to the container spec of every Pod in the workload and a new product release.

Dropping all capabilities and adding back only those required to run the workload is the most prescriptive way to declare what capabilities a container needs. Following the principle of least privilege as a best practice requires declaring container capabilities in this form. Otherwise there is a risk that new capabilities are opened up in the container, which exposes a threat vector.

IMPACT:

This impacts 75 Helm operators currently in service.

Comment 2 Mark Rooks 2020-03-24 05:00:10 UTC
NOTE: The default Helm 3 template says to drop all capabilities.

Comment 3 Mark Rooks 2020-03-24 05:05:16 UTC
The customer (CU) suggested one of the following.

Either give the Pod admission controller some way of determining the default capabilities of the kubelet on the Node where the Pod is to be scheduled. This would make it possible to drop ALL capabilities and add back only the ones the workload needs.

Or:

All of the namespace-safe capabilities could be added to the default SCCs shipped with OCP, so that drop functionality can be achieved with the platform defaults.

Comment 4 Maciej Szulik 2020-03-24 11:21:52 UTC
This is a question to apiserver folks, redirecting accordingly.

Comment 6 Stefan Schimanski 2020-05-19 10:50:25 UTC
The apiserver cannot know the default set of capabilities. Hence, it cannot verify whether the newly added ones are included in the dropped ones. This is by design and cannot be changed.

The discussion is ongoing in the RFE. Closing here.

