Bug 2008119
| Summary: | The serviceAccountIssuer field on Authentication CR is reset to “” during the installation process | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | wang lin <lwan> | |
| Component: | Installer | Assignee: | Matthew Staebler <mstaeble> | |
| Installer sub component: | openshift-installer | QA Contact: | Mike Fiedler <mifiedle> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | urgent | |||
| Priority: | urgent | CC: | aos-bugs, aos-install, lwan, mfojtik, mifiedle, mstaeble, nstielau, scuppett, sdodson, surbania, xxia, yunjiang | |
| Version: | 4.9 | Keywords: | Regression, TestBlocker | |
| Target Milestone: | --- | |||
| Target Release: | 4.10.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: |
Cause: During bootstrapping, there are two components that are both trying to write manifests to the k8s API server. The first is the cluster-bootstrap, which is trying to write manifests supplied by the installer. The second is the cluster-version-operator, which is trying to write manifests from the release image. If there is a manifest supplied by the installer for a resource that also has a manifest in the release image, then there is a race between which manifest will actually be written.
Consequence: If the cluster-bootstrap loses the race to create the Authentication resource, the customizations added by the user are lost.
Fix: Explicitly block the cluster-version-operator from creating the resources that are created by installer manifests. All resources from installer manifests are temporarily added to the ClusterVersion resource as resources to ignore. After bootstrapping, those resources are removed from the ClusterVersion resource ignore list.
Result: Successful installations with user customizations to the Authentication resource retained.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 2009342 (view as bug list) | Environment: | ||
| Last Closed: | 2022-03-12 04:38:40 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2009342 | |||
|
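The race described in the Doc Text can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual cluster-bootstrap or cluster-version-operator logic: `FakeAPIServer`, `bootstrap`, `AUTH_KEY`, the example issuer URL, and the empty default in the release-image manifest are all hypothetical. It assumes both writers use create-only semantics, and it models the ignore list as a plain set of resource keys.

```python
# Toy model of the bootstrap race. This is NOT the real cluster-bootstrap or
# cluster-version-operator code; all names and values here are hypothetical.

AUTH_KEY = ("Authentication", "cluster")

MANIFESTS = {
    # Installer-supplied manifest carrying the user's customization (assumed value).
    "cluster-bootstrap": {"serviceAccountIssuer": "https://issuer.example.com"},
    # Release-image manifest, assumed to carry an empty default.
    "cluster-version-operator": {"serviceAccountIssuer": ""},
}


class FakeAPIServer:
    """In-memory stand-in for the API server with create-only writes."""

    def __init__(self):
        self.objects = {}

    def create(self, key, spec):
        """Store spec under key unless it already exists; return True if created."""
        if key in self.objects:
            return False  # analogous to 'Skipped ... as it already exists'
        self.objects[key] = dict(spec)
        return True


def bootstrap(order, ignored=frozenset()):
    """Run the writers in the given order and return the resulting spec."""
    api = FakeAPIServer()
    for writer in order:
        # Model of the fix: the CVO skips resources on the ignore list.
        if writer == "cluster-version-operator" and AUTH_KEY in ignored:
            continue
        api.create(AUTH_KEY, MANIFESTS[writer])
    return api.objects.get(AUTH_KEY)


cvo_first = ["cluster-version-operator", "cluster-bootstrap"]
print(bootstrap(cvo_first))                      # customization lost
print(bootstrap(cvo_first, ignored={AUTH_KEY}))  # customization retained
```

With the cluster-version-operator running first and no ignore list, the second create is skipped and the empty issuer wins. With the Authentication resource on the ignore list, the installer manifest is the only one written, regardless of ordering.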
Comment 1
Standa Laznicka
2021-09-27 12:07:40 UTC
Please attach the manifests that you are adding. Also, please attach the full install directory, including the log file and the state file.

Note that the install of the cluster was not successful. The openshift-apiserver is reporting APIServicesAvailable with 503 errors. I do not know enough about the authentication type to know if the APIServicesAvailable error is a result of the misconfigured authentication or vice versa.

Because this issue blocks all STS cluster installations, I am adding the TestBlocker keyword. This is also believed to be a regression.

If this is a regression it should also be marked as a blocker. Is there a reason you don't believe that to be the case?

(In reply to Scott Dodson from comment #8)
> If this is a regression it should also be marked as a blocker. Is there a
> reason you don't believe that to be the case?

This seems like a blocker to me.

Some other component is creating the authentication resource prior to the CVO laying down the manifest supplied to the installer. Here are logs from the bootstrap of an install that I ran.

```
$ journalctl -u bootkube.service | grep authentication
Sep 28 17:20:56 ip-10-0-4-239 bootkube.sh[2233]: Writing asset: /assets/config-bootstrap/manifests/0000_10_config-operator_01_authentication.crd.yaml
Sep 28 17:21:51 ip-10-0-4-239 bootkube.sh[2233]: Created "0000_10_config-operator_01_authentication.crd.yaml" customresourcedefinitions.v1.apiextensions.k8s.io/authentications.config.openshift.io -n
Sep 28 17:22:32 ip-10-0-4-239 bootkube.sh[2233]: "cluster-authentication-02-config.yaml": unable to get REST mapping for "cluster-authentication-02-config.yaml": no matches for kind "Authentication" in version "config.openshift.io/v1"
Sep 28 17:22:39 ip-10-0-4-239 bootkube.sh[2233]: Skipped "cluster-authentication-02-config.yaml" authentications.v1.config.openshift.io/cluster -n as it already exists
```

The authentication CR is included as a manifest by the cluster-config-operator.
https://github.com/openshift/cluster-config-operator/blob/master/empty-resources/0000_05_config-operator_02_authentication.cr.yaml

I am moving this bug to kube-apiserver, as they own cluster-bootstrap. The cluster-bootstrap is failing to write the authentication resource if the cluster-version-operator writes the resource from cluster-config-operator first.

This is working as designed. The cluster-config-operator creates the CRs as "create-only". It also lays down the CRDs. So there is a race between the installer's use of cluster-bootstrap and cluster-config-operator. There is no mechanism in place, and it was never a requirement, that the config CR creation is replaced "by the installer". In other words, this is not a bug, but an RFE if we want to fix this.

Looks like the fix is not in 4.10.0-0.nightly-2021-10-01-013103. The problem still occurs there and https://amd64.ocp.releases.ci.openshift.org/releasestream/4.10.0-0.nightly/release/4.10.0-0.nightly-2021-10-01-013103?from=4.10.0-0.nightly-2021-09-30-041351 does not list the fix in the Installer section.

Moving to MODIFIED

Looks like the fix will be in 4.10.0-0.nightly-2021-10-01-141332. Will test there.

Back to ON_QA

Verified on 4.10.0-0.nightly-2021-10-01-141332 - AWS STS install successful.
{
  "oauthMetadata": {
    "name": ""
  },
  "serviceAccountIssuer": "https://xxxxxxxx-oidc.s3.us-east-2.amazonaws.com",
  "type": "",
  "webhookTokenAuthenticator": {
    "kubeConfig": {
      "name": "webhook-authentication-integrated-oauth"
    }
  }
}
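A verification like the one above could also be scripted rather than checked by eye. The following is a minimal sketch, assuming the JSON shown is the `spec` of the Authentication resource saved as text; the function name `issuer_is_preserved` and the way the JSON is obtained are assumptions, not part of the original report.

```python
import json


def issuer_is_preserved(spec_json, expected_issuer):
    """Return True if the spec's serviceAccountIssuer equals the expected value.

    spec_json is assumed to be the Authentication resource's spec as JSON text.
    A missing field is treated the same as an empty issuer.
    """
    spec = json.loads(spec_json)
    return spec.get("serviceAccountIssuer", "") == expected_issuer


# Example using the placeholder issuer shown in the verification output.
sample = '{"serviceAccountIssuer": "https://xxxxxxxx-oidc.s3.us-east-2.amazonaws.com", "type": ""}'
print(issuer_is_preserved(sample, "https://xxxxxxxx-oidc.s3.us-east-2.amazonaws.com"))
```

A spec in which the field was reset to an empty string, as in the original failure, would make the check return False for any non-empty expected issuer.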
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056