Bug 1873121
| Summary: | RedHat CoreOS worker node is not listening to port 80 and 443 | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | barnali <b.bhattacharyya> |
| Component: | Documentation | Assignee: | Eric Ponvelle <eponvell> |
| Status: | CLOSED NOTABUG | QA Contact: | Hongan Li <hongli> |
| Severity: | high | Docs Contact: | Vikram Goyal <vigoyal> |
| Priority: | unspecified | | |
| Version: | 4.5 | CC: | aos-bugs, bbreard, hongli, imcleod, jligon, jokerman, kalexand, mfisher, nstielau |
| Target Milestone: | --- | | |
| Target Release: | 4.7.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-04-21 03:45:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
barnali 2020-08-27 12:41:10 UTC
This looks more like a problem with the haproxy pod/container not operating correctly on the node than an OS-level issue. Could you provide a must-gather for the problematic node? Moving to Routing as they may be better suited to triage this kind of problem. The target is set to 4.7 while investigation is either ongoing or not yet started; the bug will be considered for earlier release versions once diagnosed and resolved.

Node ports are only used for HAProxy if the endpoint publishing strategy [1] is set to "HostNetwork". To better assist you, please provide the following details:

1. Output of `oc get infrastructure/cluster -o`
2. Details of the ingresscontroller resource used to run HAProxy, i.e. `oc get ingresscontroller/default -n openshift-ingress-operator -o yaml`
3. Ingress Operator logs.

[1] https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L69-L86

1>

```console
$ oc get infrastructure/cluster
NAME      AGE
cluster   5d21h
```

The issue was that the ingresscontroller replica count defaulted to 2, so no ingress controller pod was created on the third worker node and ports 80 and 443 were left unused there. I scaled the default ingress controller with the command below, and the node started showing as healthy in HAProxy:

```console
$ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 3}}' --type=merge
```

I guess the replica count can also be set during OpenShift installation by specifying it in the cluster-ingress-02-config.yml file. It would be helpful if this information were mentioned in the installation document (https://docs.openshift.com/container-platform/4.5/installing/installing_bare_metal/installing-bare-metal.html), similar to how the following information is already provided:

<<<Modify the <installation_directory>/manifests/cluster-scheduler-02-config.yml Kubernetes manifest file to prevent Pods from being scheduled on the control plane machines>>>

2>

```console
$ oc get ingresscontroller/default -n openshift-ingress-operator -o yaml
```

```yaml
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  creationTimestamp: "2020-08-26T09:37:01Z"
  finalizers:
  - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller
  generation: 2
  managedFields:
  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:replicas: {}
    manager: oc
    operation: Update
    time: "2020-08-27T17:10:30Z"
  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"ingresscontroller.operator.openshift.io/finalizer-ingresscontroller": {}
      f:spec: {}
      f:status:
        .: {}
        f:availableReplicas: {}
        f:conditions: {}
        f:domain: {}
        f:endpointPublishingStrategy:
          .: {}
          f:type: {}
        f:observedGeneration: {}
        f:selector: {}
        f:tlsProfile:
          .: {}
          f:ciphers: {}
          f:minTLSVersion: {}
    manager: ingress-operator
    operation: Update
    time: "2020-08-27T17:10:50Z"
  name: default
  namespace: openshift-ingress-operator
  resourceVersion: "740546"
  selfLink: /apis/operator.openshift.io/v1/namespaces/openshift-ingress-operator/ingresscontrollers/default
  uid: 7779c895-9fc4-4f49-aa96-935e01367f71
spec:
  replicas: 3
status:
  availableReplicas: 3
  conditions:
  - lastTransitionTime: "2020-08-26T09:37:02Z"
    reason: Valid
    status: "True"
    type: Admitted
  - lastTransitionTime: "2020-08-27T13:37:30Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2020-08-27T13:37:30Z"
    message: The deployment has Available status condition set to True
    reason: DeploymentAvailable
    status: "False"
    type: DeploymentDegraded
  - lastTransitionTime: "2020-08-26T09:37:05Z"
    message: The configured endpoint publishing strategy does not include a managed load balancer
    reason: EndpointPublishingStrategyExcludesManagedLoadBalancer
    status: "False"
    type: LoadBalancerManaged
  - lastTransitionTime: "2020-08-26T09:37:05Z"
    message: No DNS zones are defined in the cluster dns config.
    reason: NoDNSZones
    status: "False"
    type: DNSManaged
  - lastTransitionTime: "2020-08-27T13:37:30Z"
    status: "False"
    type: Degraded
  domain: apps.dmcan.ocppoc.cluster
  endpointPublishingStrategy:
    type: HostNetwork
  observedGeneration: 2
  selector: ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
  tlsProfile:
    ciphers:
    - TLS_AES_128_GCM_SHA256
    - TLS_AES_256_GCM_SHA384
    - TLS_CHACHA20_POLY1305_SHA256
    - ECDHE-ECDSA-AES128-GCM-SHA256
    - ECDHE-RSA-AES128-GCM-SHA256
    - ECDHE-ECDSA-AES256-GCM-SHA384
    - ECDHE-RSA-AES256-GCM-SHA384
    - ECDHE-ECDSA-CHACHA20-POLY1305
    - ECDHE-RSA-CHACHA20-POLY1305
    - DHE-RSA-AES128-GCM-SHA256
    - DHE-RSA-AES256-GCM-SHA384
    minTLSVersion: VersionTLS12
```
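Regarding the install-time question raised above (before item 2>), a minimal sketch of what an install-time manifest could look like follows. It assumes the usual workflow of dropping an extra manifest into `<installation_directory>/manifests/` before running the installer; the file name, its placement, and the exact field values are illustrative assumptions and are not confirmed in this bug, and the reply below recommends scaling after installation instead.

```yaml
# Hypothetical manifest placed in <installation_directory>/manifests/
# (file name and placement are assumptions; the replica count can equally
#  be changed post-install, which is the practice recommended below)
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  replicas: 3                     # one router pod per worker so every node binds ports 80/443
  endpointPublishingStrategy:
    type: HostNetwork             # matches the strategy reported in the output above
```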
@barnali, thanks for the feedback. I'm reassigning the bz to docs based on the feedback in https://bugzilla.redhat.com/show_bug.cgi?id=1873121#c4.

Hi Eric, actually the file cluster-ingress-02-config.yml only contains the ingress base domain setting, see below:

```console
$ cat cluster-ingress-02-config.yml
apiVersion: config.openshift.io/v1
kind: Ingress
metadata:
  creationTimestamp: null
  name: cluster
spec:
  domain: apps.example.openshift.com
status: {}
```

The replicas are defined in the ingresscontroller CR instead of the Ingress.config.openshift.io resource above. I believe the best practice for this operation is scaling it after the installation is complete, see below:

```console
$ oc -n openshift-ingress-operator scale ingresscontroller/default --replicas=3
```

(Patching the ingresscontroller as in #Comment 4 is also fine.)

@barnali, what do you think? If this is acceptable, please feel free to close this. Thanks.

This can be closed if the best practice is scaling it after the installation is completed.
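For anyone hitting the same symptom, the checks below are one way to confirm whether every worker hosts a router pod and is therefore listening on ports 80/443 under the HostNetwork strategy. The commands are standard `oc` calls; the replica count of 3 assumes three worker nodes as in this bug, and the output will of course show your own node names.

```console
# Which nodes are running router pods? With HostNetwork, only these nodes
# bind host ports 80/443, so a worker without a pod shows unhealthy in an
# external HAProxy that checks all workers.
$ oc -n openshift-ingress get pods -o wide

# Current replica count on the default IngressController
$ oc -n openshift-ingress-operator get ingresscontroller/default -o jsonpath='{.spec.replicas}{"\n"}'

# Scale to match the number of workers, as discussed in this bug
$ oc -n openshift-ingress-operator scale ingresscontroller/default --replicas=3
```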