Bug 2097830
Summary: | Ingress operator certificate is not trusted | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Paige Rubendall <prubenda> |
Component: | oc | Assignee: | Ross Peoples <rpeoples> |
oc sub component: | oc | QA Contact: | zhou ying <yinzhou> |
Status: | CLOSED UPSTREAM | Docs Contact: | |
Severity: | medium | | |
Priority: | medium | CC: | asoro, bkozdemb, cfergeau, hongli, jchaloup, jesper.laursen, jiazha, jkopriva, knarra, kpiwko, maszulik, mfojtik, mmasters, prkumar, pruan, prubenda, rpeoples, talessio, thomas.marko, wlewis, yinzhou |
Version: | 4.10 | Keywords: | Regression |
Target Milestone: | --- | | |
Target Release: | 4.12.z | | |
Hardware: | Unspecified | | |
OS: | Mac OS | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2023-02-10 16:07:34 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 2167409, 2167412 | | |
Bug Blocks: | 1880865 | | |
Description
Paige Rubendall
2022-06-16 16:22:38 UTC
After some trial and error, the last known good build for Mac was https://mirror.openshift.com/pub/openshift-v4/clients/ocp-dev-preview/4.11.0-0.nightly-2022-05-18-171831/. Things start to break with the next known build, https://mirror.openshift.com/pub/openshift-v4/clients/ocp-dev-preview/4.11.0-0.nightly-2022-05-20-153751/, so this narrows down the scope quite a bit. This should be an oc issue.

Adding the Regression keyword because this looks like a regression after we switched to Go 1.18 in ART. More info can be found in this thread: https://coreos.slack.com/archives/CS05TR7BK/p1655805978095309

I haven't been able to find any code changes in oc that would explain the Mac-specific issue. Since this started around the time we transitioned to Go 1.18, I started looking at issues reported upstream and found this: https://github.com/golang/go/issues/51763

I have reached out to ART to ask whether that fix has been included in the current builds: https://coreos.slack.com/archives/CB95J6R4N/p1655825129058259

Response from ART: the fix is not available yet. They are working on getting a 1.18.2 point release through QE. Tracking here: https://issues.redhat.com/browse/ART-4161

From ART: OpenShift 4.11 releases are now being built with Go 1.18.2, so that should include the fix required to resolve this issue.

I saw that https://issues.redhat.com/browse/ART-4161 has been closed, but I can still reproduce the issue, and oc is still built with go1.18.1:

ec2-user@ip-10-0-7-234 ~ % ./oc login -u kubeadmin -p M6Joy-NLZxh-Ik3Sm-IqxQJ https://api.yinzhou14.qe.devcluster.openshift.com:6443 --insecure-skip-tls-verify
error: x509: “ingress-operator@1657784233” certificate is not trusted
ec2-user@ip-10-0-7-234 ~ % ./oc version --client -o yaml
clientVersion:
  buildDate: "2022-07-07T21:15:37Z"
  compiler: gc
  gitCommit: f17f1aa456e00569e88a0e8206c5b360e44055dd
  gitTreeState: clean
  gitVersion: 4.11.0-202207072008.p0.gf17f1aa.assembly.stream-f17f1aa
  goVersion: go1.18.1
  major: ""
  minor: ""
  platform: darwin/arm64
kustomizeVersion: v4.5.4
releaseClientVersion: 4.11.0-rc.2

Setting blocker? because I think this is a blocker: with this we cannot get oc to work properly on macOS, and any users who update to the latest version will hit the issue. Thanks!!

I think it needs to be built with go 1.18.2, which is what ART-4161 was tracking, so I'm not sure when go 1.18.2 takes effect for builds.

@ross peoples, from the conversation yesterday with the ART team it looks like we did have it built with go 1.18.2. But below is what Luke says:

> if the "cert not trusted" problem is due to go 1.18 dropping crypto algorithms (not confident that's the case here, but seems to be implied) then i think that was fixed in go 1.18.1 anyway. my guess would be the go version is fine and it's still a bug for some other reason (or needs a flag to re-enable legacy algorithms)

Also, please find the conversation in the link below.
https://coreos.slack.com/archives/CJARLA942/p1657795129757189

After some more testing with a Mac x64 VM, I was able to reproduce the issue with both go1.18 and go1.18.2. Compiling oc with go1.17.7 works as expected, so this is likely a regression with go1.18 that has not yet been fixed.
I'll look into enabling the legacy algorithms, as this is likely the cause now that the related Mac-specific bug was addressed in go1.18.1.

(In reply to Ross Peoples from comment #11)
> After some more testing with a Mac x64 VM, I was able to reproduce the issue
> with both go1.18 and go1.18.2. Compiling oc with go1.17.7 works as expected,
> so this is likely a regression with go1.18 that has not yet been fixed. I'll
> look into enabling the legacy algorithms, as this is likely the cause now
> that related Mac-specific bug was addressed in go1.18.1.

Awesome, thanks!! Not sure if we need to approve the blocker+ flag for this, as we know this is a regression.

I found a workaround for this, but it's not great. I'll have a PR for review shortly.

(In reply to Ross Peoples from comment #13)
> I found a workaround for this, but it's not great. I'll have a PR for review
> shortly.

When can we expect this bug to be ON_QA? Thanks!!

It is unlikely the PR for this will get approved as is. The issue was identified upstream and the ideal solution is to wait for them to fix it: https://github.com/golang/go/issues/52010

The workaround provided by the PR is not really a suitable long-term fix. There are still ongoing discussions around next steps, and as to whether the blocker+ flag should be set.

Until the issue gets properly fixed upstream, the temporary workaround is to use the 4.10 oc for any CLI interaction with 4.11 clusters. If there is a need to use features that are available in 4.11 but not in 4.10, we can discuss how to handle those scenarios case by case. Lowering the priority/severity to medium.

RN tracker updated: https://github.com/openshift/openshift-docs/issues/43249#issuecomment-1191169097
Once resolved, the RN needs to be updated one more time.

I have submitted a PR upstream that will fix this issue: https://github.com/golang/go/pull/53986

This issue has also been reported as <https://issues.redhat.com/browse/OCPBUGS-1028>.
If backports are needed, it will be easier to use the Jira bug to track the original fix along with the backports. Would you like us to close OCPBUGS-1028 as a duplicate, or would you prefer to close this Bugzilla report as a duplicate? Either way is fine with me.

We can use this Bugzilla, really whatever is easier.

*** Bug 2091771 has been marked as a duplicate of this bug. ***

Any updates on this issue? The Jira card was marked as a duplicate of this issue, but no comments have been made in a few months. https://issues.redhat.com/browse/OCPBUGS-1028
This is still affecting 4.11 and 4.12 versions, which completely blocks Mac users from logging in to their cluster, with no workaround.

@yingzhou, can you please help check whether we are seeing this issue as Paige describes? If yes, can we work with Arda to see how we can get this fixed? Thanks!!

We were trying to get this fixed upstream, but it was not accepted. We are looking to create a patch for 4.12 that should resolve this temporarily until a proper fix is available upstream.

I will just share a kind of workaround for Mac users, provided you can get the token from the web console.
https://docs.openshift.com/container-platform/4.7/cli_reference/openshift_cli/getting-started-cli.html
Use the Copy login command. It is then possible to configure a context using that token. You can of course also just pass --token in every command if you like.

oc config set-cluster YourClusterName --server=https://api.change_this.com:6443 --insecure-skip-tls-verify=true
oc config set-credentials YourUserName --token=<TokenFromWebConsole>
oc config set-context YourClusterContext --cluster YourClusterName --user=YourUserName
oc config use-context YourClusterContext
oc get nodes

SUCCESS!
:)

The following workaround worked for me:
- in the web console, go to the project "openshift-kube-apiserver"
- go to the project's secrets (Administrator -> Workloads -> Secrets) and search for the external load balancer's TLS secret ("external-loadbalancer-serving-certkey")
- copy the content of tls.crt to a local file on your Mac, open the file with the keychain application, and add the certificate to the system keychain
- mark the certificate ("kube-apiserver-lb-signer") as always trusted

oc login will work after that.
HTH
Thomas

Hi Thomas, I followed your workaround above (mark the certificate "kube-apiserver-lb-signer" as always trusted), but it still doesn't work.

MacBook-Pro:~ jianzhang$ oc login -u testuser-49 -p xxx https://api.qe-daily-412-1128.qe.azure.devcluster.openshift.com:6443
error: x509: “ingress-operator@1669591857” certificate is not trusted
MacBook-Pro:~ jianzhang$ oc --loglevel 8 login -u testuser-49 -p xxx https://api.qe-daily-412-1128.qe.azure.devcluster.openshift.com:6443
I1128 18:44:27.823401 88675 loader.go:372] Config loaded from file: /Users/jianzhang/28-kubeconfig
...
I1128 18:44:29.934624 88675 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: 419f5308-da86-4e35-bb94-ec9ce57dac2b
I1128 18:44:29.936339 88675 request.go:1073] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"configmaps \"motd\" is forbidden: User \"system:anonymous\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"openshift\"","reason":"Forbidden","details":{"name":"motd","kind":"configmaps"},"code":403}
error: x509: “ingress-operator@1669591857” certificate is not trusted

(In reply to Jian Zhang from comment #28)
> MacBook-Pro:~ jianzhang$ oc login -u testuser-49 -p xxx
> https://api.qe-daily-412-1128.qe.azure.devcluster.openshift.com:6443
> error: x509: “ingress-operator@1669591857” certificate is not trusted

If you are using a username and password for authentication, my workaround does not work. You have to create an API token. There are at least two ways to get a token:

1. Log in to the web console (https://console-openshift-console.apps.qe-daily-412-1128.qe.azure.devcluster.openshift.com/) and open your user menu ("Copy Login command") to open another authentication page, then copy the login command with the generated token into your shell.
2. Go to https://console-openshift-console.apps.qe-daily-412-1128.qe.azure.devcluster.openshift.com/oauth/token/request, authenticate, and request a token the same way as above.

HTH!
Cheers, Thomas

Hello Jian and Thomas,
I cannot speak for every environment, but Thomas's solution worked for me when logging in with kubeadmin; I just also had to download tls.crt from the secret router-ca of the project openshift-ingress-operator (and, likewise, mark it as Always Trusted with Keychain Manager).

This impacts multiple tools (crc, odo, ... in addition to oc), and https://github.com/openshift/oc/pull/1207 has a fix which avoids the need for any user workaround. `oc login` not working out of the box on macOS is imo a fairly bad issue; could this get some attention?

Hi, is there any ETA for when this will be fixed? It has been open for more than six months and it's affecting all users on macOS. Thank you! Josef

I have just installed openshift-cli 4.12 and it seems to work now with the following command:

oc login -u $username --insecure-skip-tls-verify=true

But it still doesn't respect the `insecure-skip-tls-verify: true` value for the cluster in `.kube/config`.

❯ oc version
Client Version: 4.12.0-202208031327
Kustomize Version: v4.5.4
Server Version: 4.11.18
Kubernetes Version: v1.24.6+5658434

Retried with some newer versions and still getting the same issue on 4.12 and 4.13:

% oc login -u kubeadmin -p *****
error: x509: “ingress-operator@1675347338” certificate is not trusted
prubenda@prubenda1-mac ~ % oc version
Client Version: 4.12.0-0.nightly-2023-01-31-232828
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-2023-01-31-174014
Kubernetes Version: v1.25.2+7dab57f

prubenda@prubenda1-mac ~ % oc login -u kubeadmin -p *****
error: x509: “ingress-operator@1675347338” certificate is not trusted
prubenda@prubenda1-mac ~ % oc version
Client Version: 4.13.0-0.nightly-2023-02-02-075453
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-2023-01-31-174014
Kubernetes Version: v1.25.2+7dab57f

prubenda@prubenda1-mac ~ % oc login -u kubeadmin -p ***** --loglevel=10
I0202 11:06:54.341945 37263 loader.go:373] Config loaded from file: /Users/prubenda/.kube/config
I0202 11:06:54.342390 37263 round_trippers.go:466] curl -v -XHEAD 'https://api.*****.qe.devcluster.openshift.com:6443/'
I0202 11:06:54.381650 37263 round_trippers.go:495] HTTP Trace: DNS Lookup for api.*****.qe.devcluster.openshift.com resolved to [{18.189.179.53 } {3.128.191.8 } {3.136.229.207 }]
I0202 11:06:54.406806 37263 round_trippers.go:510] HTTP Trace: Dial to tcp:18.189.179.53:6443 succeed
I0202 11:06:54.462176 37263 round_trippers.go:553] HEAD https://api.*****.qe.devcluster.openshift.com:6443/ 403 Forbidden in 119 milliseconds
I0202 11:06:54.462203 37263 round_trippers.go:570] HTTP Statistics: DNSLookup 39 ms Dial 25 ms TLSHandshake 26 ms ServerProcessing 28 ms Duration 119 ms
I0202 11:06:54.462210 37263 round_trippers.go:577] Response Headers:
I0202 11:06:54.462216 37263 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: 88dcf375-c13c-4039-ac3d-ccf96b6247dd
I0202 11:06:54.462222 37263 round_trippers.go:580] Content-Length: 186
I0202 11:06:54.462228 37263 round_trippers.go:580] Date: Thu, 02 Feb 2023 16:06:54 GMT
I0202 11:06:54.462233 37263 round_trippers.go:580] Audit-Id: 6c45307d-3aa3-483d-8e36-233c8e9e52f2
I0202 11:06:54.462238 37263 round_trippers.go:580] Content-Type: application/json
I0202 11:06:54.462243 37263 round_trippers.go:580] Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0202 11:06:54.462249 37263 round_trippers.go:580] Cache-Control: no-cache, private
I0202 11:06:54.462254 37263 round_trippers.go:580] X-Content-Type-Options: nosniff
I0202 11:06:54.462259 37263 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: 701375f0-25d8-49fe-b216-25dde705e30e
I0202 11:06:54.462299 37263 round_trippers.go:466] curl -v -XGET -H "X-Csrf-Token: 1" 'https://api.*****.qe.devcluster.openshift.com:6443/.well-known/oauth-authorization-server'
I0202 11:06:54.486097 37263 round_trippers.go:553] GET https://api.*****.qe.devcluster.openshift.com:6443/.well-known/oauth-authorization-server 200 OK in 23 milliseconds
I0202 11:06:54.486126 37263 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 23 ms Duration 23 ms
I0202 11:06:54.486133 37263 round_trippers.go:577] Response Headers:
I0202 11:06:54.486140 37263 round_trippers.go:580] Content-Type: application/json
I0202 11:06:54.486146 37263 round_trippers.go:580] Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0202 11:06:54.486151 37263 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: 701375f0-25d8-49fe-b216-25dde705e30e
I0202 11:06:54.486156 37263 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: 88dcf375-c13c-4039-ac3d-ccf96b6247dd
I0202 11:06:54.486162 37263 round_trippers.go:580] Content-Length: 654
I0202 11:06:54.486167 37263 round_trippers.go:580] Date: Thu, 02 Feb 2023 16:06:54 GMT
I0202 11:06:54.486172 37263 round_trippers.go:580] Audit-Id: 4549ba32-1697-4e77-b0b7-877ca53aafd4
I0202 11:06:54.486177 37263 round_trippers.go:580] Cache-Control: no-cache, private
I0202 11:06:54.625143 37263 request_token.go:477] unexpected error during system roots probe: x509: “ingress-operator@1675347338” certificate is not trusted
I0202 11:06:54.625729 37263 round_trippers.go:466] curl -v -XGET -H "Accept: application/json, */*" -H "User-Agent: oc/4.13.0 (darwin/amd64) kubernetes/8b3f6f9" 'https://api.*****.qe.devcluster.openshift.com:6443/api/v1/namespaces/openshift/configmaps/motd'
I0202 11:06:54.650300 37263 round_trippers.go:553] GET https://api.*****.qe.devcluster.openshift.com:6443/api/v1/namespaces/openshift/configmaps/motd 403 Forbidden in 24 milliseconds
I0202 11:06:54.650347 37263 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 24 ms Duration 24 ms
I0202 11:06:54.650359 37263 round_trippers.go:577] Response Headers:
I0202 11:06:54.650372 37263 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: 88dcf375-c13c-4039-ac3d-ccf96b6247dd
I0202 11:06:54.650383 37263 round_trippers.go:580] Content-Length: 303
I0202 11:06:54.650392 37263 round_trippers.go:580] Cache-Control: no-cache, private
I0202 11:06:54.650402 37263 round_trippers.go:580] Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0202 11:06:54.650411 37263 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: 701375f0-25d8-49fe-b216-25dde705e30e
I0202 11:06:54.650421 37263 round_trippers.go:580] Date: Thu, 02 Feb 2023 16:06:54 GMT
I0202 11:06:54.650437 37263 round_trippers.go:580] Audit-Id: 8d467510-9f51-41d0-b5a4-d54d8c2a1d9b
I0202 11:06:54.650448 37263 round_trippers.go:580] Content-Type: application/json
I0202 11:06:54.650457 37263 round_trippers.go:580] X-Content-Type-Options: nosniff
I0202 11:06:54.650515 37263 request.go:1171] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"configmaps \"motd\" is forbidden: User \"system:anonymous\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"openshift\"","reason":"Forbidden","details":{"name":"motd","kind":"configmaps"},"code":403}
error: x509: “ingress-operator@1675347338” certificate is not trusted

This should be fixed with golang 1.20/1.19.5/1.18.10. This was tracked in https://github.com/golang/go/issues/56891 on the golang side.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
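
For anyone triaging a particular oc binary, one rough way to tell whether it predates the fix is to compare the goVersion field printed by `oc version --client -o yaml` with the Go releases named in this report. The following is only a sketch: `oc_go_version` and `go_version_affected` are hypothetical helper names, not part of oc, and the affected range is an assumption pieced together from the comments here (go1.17.7 worked, go1.18 through go1.18.2 reproduced the error, and 1.18.10/1.19.5/1.20 are said to carry the fix).

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical and not part of oc.

# Print the goVersion value found in `oc version --client -o yaml` output (stdin).
oc_go_version() {
  sed -n 's/^[[:space:]]*goVersion:[[:space:]]*//p'
}

# Exit status 0 if the Go version (for example "go1.18.1") is in the range this
# report describes as affected, 1 if it is not, 2 if the string is not understood.
# Assumptions: releases before 1.18 are treated as unaffected (only go1.17.7 was
# tested in this report), and 1.18.10, 1.19.5 and 1.20 onward carry the fix.
go_version_affected() {
  v="${1#go}"                       # drop the leading "go"
  case "$v" in
    1.*) rest="${v#1.}" ;;          # keep "minor" or "minor.patch"
    *) return 2 ;;
  esac
  minor="${rest%%.*}"
  case "$rest" in
    *.*) patch="${rest#*.}" ;;
    *) patch=0 ;;                   # "go1.20" has no patch component
  esac
  # Reject anything that is not purely numeric (release candidates, typos).
  case "$minor" in ''|*[!0-9]*) return 2 ;; esac
  case "$patch" in ''|*[!0-9]*) return 2 ;; esac
  if [ "$minor" -eq 18 ] && [ "$patch" -lt 10 ]; then return 0; fi
  if [ "$minor" -eq 19 ] && [ "$patch" -lt 5 ]; then return 0; fi
  return 1
}

# Example use (commented out so the sketch does not require oc to be installed):
#   if go_version_affected "$(./oc version --client -o yaml | oc_go_version)"; then
#     echo "this oc was built with a Go release in the affected range"
#   fi
```

A client built with a Go release outside this range may still fail for unrelated reasons, so treat the result as a hint rather than a diagnosis.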