Bug 1740121
| Summary: | Install master as schedulable node will meet co authentication Err | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | ge liu <geliu> |
| Component: | apiserver-auth | Assignee: | Stefan Schimanski <sttts> |
| Status: | CLOSED ERRATA | QA Contact: | ge liu <geliu> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.2.0 | CC: | aos-bugs, eparis, mfojtik, mkhan |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.2.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1740337 (view as bug list) | Environment: | |
| Last Closed: | 2019-10-16 06:35:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1740337 | | |
| Attachments: | | | |
Comment 1
Mo
2019-08-12 14:31:40 UTC
Moving back to assigned as I know there are at least two distinct bugs here.

Hello Mo,
I tried with 4.2.0-0.nightly-2019-08-20-213632, and it seems this issue is still present.
# oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication Unknown Unknown True 125m
cloud-credential 4.2.0-0.nightly-2019-08-20-213632 True False False 129m
cluster-autoscaler 4.2.0-0.nightly-2019-08-20-213632 True False False 124m
console 4.2.0-0.nightly-2019-08-20-213632 False True False 124m
dns 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
image-registry 4.2.0-0.nightly-2019-08-20-213632 True False False 123m
ingress 4.2.0-0.nightly-2019-08-20-213632 True False False 124m
insights 4.2.0-0.nightly-2019-08-20-213632 True False False 129m
kube-apiserver 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
kube-controller-manager 4.2.0-0.nightly-2019-08-20-213632 True False False 126m
kube-scheduler 4.2.0-0.nightly-2019-08-20-213632 True False False 126m
machine-api 4.2.0-0.nightly-2019-08-20-213632 True False False 129m
machine-config 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
marketplace 4.2.0-0.nightly-2019-08-20-213632 True False False 123m
monitoring 4.2.0-0.nightly-2019-08-20-213632 True False False 121m
network 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
node-tuning 4.2.0-0.nightly-2019-08-20-213632 True False False 125m
openshift-apiserver 4.2.0-0.nightly-2019-08-20-213632 True False False 125m
openshift-controller-manager 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
openshift-samples 4.2.0-0.nightly-2019-08-20-213632 True False False 123m
operator-lifecycle-manager 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
operator-lifecycle-manager-catalog 4.2.0-0.nightly-2019-08-20-213632 True False False 128m
operator-lifecycle-manager-packageserver 4.2.0-0.nightly-2019-08-20-213632 True False False 125m
service-ca 4.2.0-0.nightly-2019-08-20-213632 True False False 129m
service-catalog-apiserver 4.2.0-0.nightly-2019-08-20-213632 True False False 125m
service-catalog-controller-manager 4.2.0-0.nightly-2019-08-20-213632 True False False 125m
storage 4.2.0-0.nightly-2019-08-20-213632 True False False 124m
# oc describe co authentication
Name: authentication
Namespace:
Labels: <none>
Annotations: <none>
API Version: config.openshift.io/v1
Kind: ClusterOperator
Metadata:
Creation Timestamp: 2019-08-21T03:22:02Z
Generation: 1
Resource Version: 15984
Self Link: /apis/config.openshift.io/v1/clusteroperators/authentication
UID: d7b4f42f-c3c2-11e9-aa43-02a5eca1f9aa
Spec:
Status:
Conditions:
Last Transition Time: 2019-08-21T03:25:51Z
Message: RouteHealthDegraded: failed to GET route: EOF
Reason: RouteHealthDegradedFailedGet
Status: True
Type: Degraded
Last Transition Time: 2019-08-21T03:22:02Z
Reason: NoData
Status: Unknown
Type: Progressing
Last Transition Time: 2019-08-21T03:22:02Z
Reason: NoData
Status: Unknown
Type: Available
Last Transition Time: 2019-08-21T03:22:02Z
Reason: AsExpected
Status: True
Type: Upgradeable
Extension: <nil>
Related Objects:
Group: operator.openshift.io
Name: cluster
Resource: authentications
Group: config.openshift.io
Name: cluster
Resource: authentications
Group: config.openshift.io
Name: cluster
Resource: infrastructures
Group: config.openshift.io
Name: cluster
Resource: oauths
Group:
Name: openshift-config
Resource: namespaces
Group:
Name: openshift-config-managed
Resource: namespaces
Group:
Name: openshift-authentication
Resource: namespaces
Group:
Name: openshift-authentication-operator
Resource: namespaces
Events: <none>
And the master nodes can still be scheduled; I deployed a pod on one successfully.
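For reference, the failing route health check can be probed by hand. This is a minimal sketch, assuming the default oauth-openshift route in the openshift-authentication namespace (which appears to be the route the operator GETs, per the RouteHealthDegraded message above):

$ HOST=$(oc get route oauth-openshift -n openshift-authentication -o jsonpath='{.spec.host}')
$ curl -kv "https://$HOST/healthz"

An EOF or timeout here matches the "failed to GET route: EOF" condition reported by the authentication ClusterOperator.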
Moving to routing to debug why routes are not working on this cluster.

Never mind the AWS bug references; I misread the original report and missed a key point about the topology under test. The problem is that there are no instances assigned to the ELB. Looking at the cluster nodes:

$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-139-134.us-east-2.compute.internal Ready master,worker 21h v1.14.0+17b784327
ip-10-0-157-164.us-east-2.compute.internal Ready master,worker 21h v1.14.0+17b784327
ip-10-0-167-50.us-east-2.compute.internal Ready master,worker 21h v1.14.0+17b784327

Notice that every worker node in the cluster is also labeled as a master. In Kubernetes, master nodes are not allowed to be load balancer targets. This is deliberate upstream behavior, not a bug. It follows that an ingress controller published by a load balancer depends on at least one non-master node on which to expose a port for the load balancer to connect to.

We should probably consider preventing ingress controllers from being scheduled on masters. If we did so, the ingress operator would report Degraded and the problem would be more visible. I think we simply don't support this topology when using cloud load balancers. I'm going to close the bug and recommend we prune the test case as unsupported. I opened https://bugzilla.redhat.com/show_bug.cgi?id=1744370 to track the scheduling and status reporting issue.

Installing the master as a schedulable node on the Azure platform is successful, but no routes can be accessed (since there is no virtual machine in the Azure LB backend pools).

$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.2.0-0.nightly-2019-08-21-235427 True False 33m Cluster version is 4.2.0-0.nightly-2019-08-21-235427

$ oc get node
NAME STATUS ROLES AGE VERSION
hongli-az427-hwp5f-master-0 Ready master,worker 49m v1.14.0+a80442411
hongli-az427-hwp5f-master-1 Ready master,worker 49m v1.14.0+a80442411
hongli-az427-hwp5f-master-2 Ready master,worker 49m v1.14.0+a80442411

$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.2.0-0.nightly-2019-08-21-235427 True False False 34m
cloud-credential 4.2.0-0.nightly-2019-08-21-235427 True False False 48m
cluster-autoscaler 4.2.0-0.nightly-2019-08-21-235427 True False False 41m
console 4.2.0-0.nightly-2019-08-21-235427 True False False 35m
dns 4.2.0-0.nightly-2019-08-21-235427 True False False 48m
image-registry 4.2.0-0.nightly-2019-08-21-235427 True False False 38m
ingress 4.2.0-0.nightly-2019-08-21-235427 True False False 41m
insights 4.2.0-0.nightly-2019-08-21-235427 True False False 48m
kube-apiserver 4.2.0-0.nightly-2019-08-21-235427 True False False 45m
kube-controller-manager 4.2.0-0.nightly-2019-08-21-235427 True False False 45m
kube-scheduler 4.2.0-0.nightly-2019-08-21-235427 True False False 45m
machine-api 4.2.0-0.nightly-2019-08-21-235427 True False False 48m
machine-config 4.2.0-0.nightly-2019-08-21-235427 True False False 42m
marketplace 4.2.0-0.nightly-2019-08-21-235427 True False False 41m
monitoring 4.2.0-0.nightly-2019-08-21-235427 True False False 39m
network 4.2.0-0.nightly-2019-08-21-235427 True False False 47m
node-tuning 4.2.0-0.nightly-2019-08-21-235427 True False False 43m
openshift-apiserver 4.2.0-0.nightly-2019-08-21-235427 True False False 43m
openshift-controller-manager 4.2.0-0.nightly-2019-08-21-235427 True False False 46m
openshift-samples 4.2.0-0.nightly-2019-08-21-235427 True False False 33m
operator-lifecycle-manager 4.2.0-0.nightly-2019-08-21-235427 True False False 47m
operator-lifecycle-manager-catalog 4.2.0-0.nightly-2019-08-21-235427 True False False 47m
operator-lifecycle-manager-packageserver 4.2.0-0.nightly-2019-08-21-235427 True False False 44m
service-ca 4.2.0-0.nightly-2019-08-21-235427 True False False 48m
service-catalog-apiserver 4.2.0-0.nightly-2019-08-21-235427 True False False 43m
service-catalog-controller-manager 4.2.0-0.nightly-2019-08-21-235427 True False False 43m
storage 4.2.0-0.nightly-2019-08-21-235427 True False False 42m

$ oc get pod -n openshift-ingress -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
router-default-86f85b897b-62rs2 1/1 Running 0 34m 10.128.0.30 hongli-az427-hwp5f-master-1 <none> <none>
router-default-86f85b897b-8cr9p 1/1 Running 0 34m 10.129.0.37 hongli-az427-hwp5f-master-2 <none> <none>

$ oc get svc -n openshift-ingress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
router-default LoadBalancer 172.30.180.223 13.89.142.235 80:31364/TCP,443:31093/TCP 42m
router-internal-default ClusterIP 172.30.176.214 <none> 80/TCP,443/TCP,1936/TCP 42m

$ curl https://console-openshift-console.apps.hongli-az427.qe.azure.devcluster.openshift.com -k -vv
* Rebuilt URL to: https://console-openshift-console.apps.hongli-az427.qe.azure.devcluster.openshift.com/
* Trying 13.89.142.235...
* TCP_NODELAY set
(time out)

Created attachment 1606863 [details]
error message when adding vm to ingress LB
When I try to update the ingress LB and add the VM to the backend pools manually, the error message says: "This virtual machine and IP address is already added in another Public load balancer backend pool."
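For anyone digging further on the Azure side, a hypothetical way to inspect which backend pools a master VM's NIC is already a member of (the resource group and NIC names below are placeholders, and the Azure CLI is assumed to be configured for the cluster's subscription):

$ az network nic show -g <resource-group> -n <master-nic-name> \
    --query 'ipConfigurations[].loadBalancerBackendAddressPools[].id'

If the error above is accurate, the output will show the NIC's IP configuration already referenced by another public load balancer's backend pool, which is why adding it manually is rejected.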
So it seems that even though the cluster can be installed, the ingress LB is still unavailable on the Azure platform.
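Stepping back to the underlying topology issue on both platforms, a quick way to confirm it (a sketch assuming the standard node-role labels used by OpenShift) is to check whether any node lacks the master role and is therefore eligible as a load balancer target:

$ oc get nodes -l node-role.kubernetes.io/master -o name
$ oc get nodes -l '!node-role.kubernetes.io/master' -o name

On the AWS and Azure clusters above every node is labeled master,worker, so the second command returns nothing and the cloud provider has no eligible instances to attach to the ELB/LB backend pool.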
Regarding comment 10, there is a new bug to track this issue, so closing this one.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922