Bug 1849583
Summary: Error: write unix @->/var/run/haproxy/haproxy-master.sock: write: broken pipe
Product: OpenShift Container Platform
Reporter: Michal Fojtik <mfojtik>
Component: Networking
Assignee: Yossi Boaron <yboaron>
Networking sub component: runtime-cfg
QA Contact: Victor Voronkov <vvoronko>
Status: CLOSED ERRATA
Docs Contact:
Severity: low
Priority: low
CC: aos-bugs, beth.white, bperkins, m.andre, pprinett, yboaron
Version: 4.6
Target Milestone: ---
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause:
The liveness probe of the haproxy container monitors the health of the HAProxy load balancer. However, HAProxy starts running only after the haproxy-monitor container has rendered its configuration, while the liveness probe starts as soon as the container is active.
Consequence:
The haproxy container is wrongly restarted by the kubelet.
Fix:
Increase the initial delay of the liveness probe to account for the time it takes the haproxy-monitor container to render the configuration.
Result:
The haproxy container is no longer wrongly restarted by the kubelet because of the liveness probe.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:08:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
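The fix described in the Doc Text amounts to raising the liveness probe's initial delay so the first check fires only after haproxy-monitor has had time to render the HAProxy configuration. A minimal sketch of that shape of change is below; the probe endpoint, port, and delay value are illustrative assumptions, not the actual values from the merged fix:

```yaml
# Hypothetical pod spec fragment, not the real static-pod manifest.
containers:
- name: haproxy
  livenessProbe:
    httpGet:
      path: /haproxy_ready   # assumed health endpoint
      port: 9444              # assumed monitor port
    # Raised so the probe only starts after config rendering is
    # expected to have finished; otherwise kubelet restarts the
    # container before HAProxy has even been launched.
    initialDelaySeconds: 50
    periodSeconds: 10
```

The key point is `initialDelaySeconds`: a probe that passes on a healthy pod can still cause spurious restarts if it begins before the workload's startup dependencies are satisfied.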
Description
Michal Fojtik
2020-06-22 10:27:16 UTC
The error seems to be in the openstack-infra namespace: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.6/1274895066913050624/artifacts/e2e-openstack/pods/openshift-openstack-infra_haproxy-5yz86jiw-12e28-qn9vz-master-0_haproxy-monitor_previous.log

I'm reassigning this to the OpenStack team for now, since routing is likely not the correct component to handle this BZ. This may affect other on-prem platforms, because they are all based on the same architecture.

Not sure this is actually causing the deployment to fail, since the pod was eventually able to recover: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.6/1274895066913050624/artifacts/e2e-openstack/pods/openshift-openstack-infra_haproxy-5yz86jiw-12e28-qn9vz-master-0_haproxy-monitor.log

For more context, the error message comes from baremetal-runtimecfg: https://github.com/openshift/baremetal-runtimecfg/blob/d8dfe19/pkg/monitor/monitor.go#L89-L95

Lowering the severity, as the reported issue did not depend on this observation. Keeping it in the queue as it might still be worth investigating this error.

Re-assigning to the newly created runtime-cfg subcomponent.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
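The "write: broken pipe" message itself is the generic symptom of writing to a unix-domain stream socket whose peer has gone away, which is what the monitor sees when it talks to the master socket before HAProxy is actually up. A small stand-alone sketch of that failure mode (in Python for brevity; baremetal-runtimecfg is written in Go, and the socket path and command here are illustrative, not the real master-socket protocol):

```python
import socket

def write_to_master_socket(sock: socket.socket, command: bytes) -> None:
    """Send a command over the (stand-in) master socket.

    Raises BrokenPipeError (errno EPIPE) when the peer end is closed,
    which is the condition behind 'write: broken pipe' in the log.
    """
    sock.sendall(command)

# Simulate the haproxy master socket with a connected unix socket pair.
monitor_end, haproxy_end = socket.socketpair()

# While the "HAProxy" end is open, the write succeeds.
write_to_master_socket(monitor_end, b"reload\n")

# If HAProxy is not running (e.g. its config was never rendered),
# the peer end is closed and the next write fails with EPIPE.
haproxy_end.close()
try:
    write_to_master_socket(monitor_end, b"reload\n")
    broke = False
except BrokenPipeError:
    broke = True

print(broke)
```

This is why the error is transient in the logs above: once haproxy-monitor renders the configuration and HAProxy opens its end of the socket, the writes start succeeding and the pod recovers on its own.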