Bug 1699324
Summary: | improper certificates, can crush a cluster | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Eric Rich <erich> |
Component: | Networking | Assignee: | Dan Mace <dmace> |
Networking sub component: | router | QA Contact: | Hongan Li <hongli> |
Status: | CLOSED NOTABUG | Docs Contact: | |
Severity: | low | | |
Priority: | low | CC: | aos-bugs, bbennett, mkhan |
Version: | 4.1.0 | Keywords: | NeedsTestCase |
Target Milestone: | --- | | |
Target Release: | 4.2.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-08-07 19:22:10 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1664187 | | |
Description
Eric Rich
2019-04-12 12:26:19 UTC
Moving to routing as they maintain ingress operator.

I'm not sure I agree with characterization of this bug. The claims I see are that providing an "invalid" certificate:

1. "causes the cluster to crash."
2. "This breaks a cluster"

Can you more clearly describe what you mean by "crash" and "break"? I've executed your steps and the ingresscontroller happily uses the provided certificate.

The ingresscontroller API doesn't describe any validation rules applied to the contents of the certificate secret[1]. Whether or not some validation _should_ be applied seems like a useful discussion, but can we agree that such a discussion should take place in the context of an RFE and not a bug report?

[1] https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L89

(In reply to Dan Mace from comment #2)

Router no longer runs.
Without a router authentication no longer works.

(In reply to Eric Rich from comment #3)

I didn't see crash loops when running the test. I am seeing the invalid cert being served. If there's actually a router crash occurring, I would expect status of the ingresscontroller/operator to reflect that. Can you provide the output of must-gather following a crash if that's what you're seeing?

Given a reference to a well-formed certificate data in a valid secret reference I would expect the certificate to be used and served as-is, and so far what I see matches those expectations.

I do agree that changing the certificate has potentially serious downstream implications, and I think our API docs should more clearly articulate the risks. We can certainly improve the API docs right away. That said, my claim is still that we're conforming to the API we publish.

I'll let the rest of the team weigh in here, but my current position is that we need to spec out the behavior in an RFE or story.

Can't reproduce a crash, and some sort of validation of the certificate would be an RFE. I'm going to close this one. If I've misunderstood and you have some steps to reproduce a crash, please let me know and re-open.

https://jira.coreos.com/browse/RFE-298 filed for this.