Bug 1701069

Summary: Cincinnati must verify integrity of manifests, layers, and prefer release info from the image
Product: OpenShift Container Platform Reporter: Clayton Coleman <ccoleman>
Component: Cluster Version OperatorAssignee: Stefan Junker <sjunker>
Status: CLOSED CURRENTRELEASE QA Contact: liujia <jiajliu>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.1.0CC: aos-bugs, bleanhar, jokerman, mmccomas
Target Milestone: ---Keywords: BetaBlocker
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-05-02 17:11:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Clayton Coleman 2019-04-17 23:37:45 UTC
This is a bug to ensure we have verified the issue (if Cincinnati already does this we are covered).

In Origin we recently audited access to release images and other tools to ensure we verified the integrity of artifacts fetched: https://github.com/openshift/origin/pull/22558

Cincinnati must treat Quay or a hosting registry as potentially vulnerable and

1. Verify that objects retrieved by digest match their digest (schema2 manifests and blobs via a straight hash of contents, schema1 manifests either must be blocked or the canonical serialization used) and reject the content otherwise
2. Ignore the text value of a tag and always retrieve metadata from the image (the version number must come from release-metadata)

Cincinnati should also be audited to verify any assumptions about the trustworthiness of the registry being scanned - unverified input is the biggest gap.

Finally, we may need to source channel data from outside quay.  We should prepare a mechanism to retrieve quay channels from a flat file delivered separately.  This can be moved to a separate work item but may be required prerelease.

We must verify content pre GA. Other mitigations will need to be triaged.

Comment 1 Stefan Junker 2019-04-18 19:03:50 UTC
I've started to create actionable tasks in Jira [1] for this ticket as far as I understand and agree.

I'm not sure about 

> Cincinnati must treat Quay or a hosting registry as potentially vulnerable

since that's not handled by mere content hashing, but rather requires (end-to-end) signature verification; that has previously been determined to be out of scope for Cincinnati [2].


[1]: https://jira.coreos.com/browse/CIN-27
[2]: https://jira.coreos.com/browse/CIN-26

Comment 2 Stefan Junker 2019-04-25 16:48:41 UTC
Layer digest verification is in flight [1] and manifest digest verification is up next.


Further, something I haven't addressed yet is this part:

> Finally, we may need to source channel data from outside quay.  We should prepare a mechanism to retrieve quay channels from a flat file delivered separately.  This can be moved to a separate work item but may be required prerelease.

This ticket is the first occurrence of this requirement. If we need or want this prerelease then we need to define (or make available to me) the details immediately.  

[1]: https://github.com/camallo/dkregistry-rs/pull/105

Comment 3 Clayton Coleman 2019-04-26 21:05:27 UTC
So, assume quay is compromised.  Channels can be arbitrarily filled with data.  That allows an attacker to take vulnerable releases we had removed from the channel and put them back in, where they will be offered to end users.  So to mitigate that, we want to be able to stop depending on quay channel data at a moments notice.  The exact mechanism is questionable, but it might be:

If quay.io goes down or is compromised, Cincinnati can read channel information from a flat file that is controlled by the hosting service for Cincinnati.

We are currently looking at resiliency stories that allow clusters to pull images from multiple clusters, in which case clusters would be able to fetch images from other registries besides quay, but we won't be able to tell clusters not to trust quay. We would want to be able to serve channels without any dependency on quay (such as a flat text file).

Comment 4 Stefan Junker 2019-04-30 15:43:59 UTC
> So, assume quay is compromised. ...

I created a separate jira ticket for this [1].

> Layer digest verification is in flight [1] 

Layer digest verification has landed on staging and will be bumped to production soon.

> manifest digest verification is up next.

After going through the flow of manifest digest verification I realized that, unlike the CVO, Cincinnati doesn't pull the manifests by reference and thus cannot verify its content.

[1]: https://jira.coreos.com/browse/CIN-36

Comment 5 Stefan Junker 2019-05-02 17:11:54 UTC
The issue as expressed by this ticket's title has been resolved [1], i.e. Cincinnati now verifies the content it downloads by digest matches the requested digest.

I just created another Jira ticket [1] for tracking the following item:

> Cincinnati should also be audited to verify any assumptions about the trustworthiness of the registry being scanned - unverified input is the biggest gap.

The other items which were discussed have all received Jira tickets as linked throughout the discussions.

[1]: https://jira.coreos.com/browse/CIN-37

Comment 6 Stefan Junker 2019-05-02 17:36:05 UTC
Correction to my previous comment cause there's a typo.

The first occurrence of [1] should point to https://jira.coreos.com/browse/CIN-27, which is the jira equivalent of the main issue identified in this ticket. The second occurrence of [1] is linked correctly with https://jira.coreos.com/browse/CIN-37.