Bug 1837676 - Haproxy should use readyz endpoint for kube-apiserver on Bare Metal deployments
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Beth White
QA Contact: Victor Voronkov
URL:
Whiteboard:
Depends On:
Blocks: 1840366
Reported: 2020-05-19 19:12 UTC by Sai Sindhur Malleni
Modified: 2023-09-14 06:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1840366
Environment:
Last Closed: 2020-07-13 17:40:02 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 1724 | 0 | None | closed | Bug 1823950: [baremetal] Switch to /readyz for haproxy healthchecking | 2021-01-27 03:05:35 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:40:22 UTC

Description Sai Sindhur Malleni 2020-05-19 19:12:16 UTC
Description of problem:

Right now, the haproxy config laid down by the baremetal installer, and consumed by the haproxy pods on the masters through hostpath volumes, looks like this:
defaults
  maxconn 20000
  mode    tcp
  log     /var/run/haproxy/haproxy-log.sock local0
  option  dontlognull
  retries 3
  timeout http-request 10s
  timeout queue        1m
  timeout connect      10s
  timeout client       86400s
  timeout server       86400s
  timeout tunnel       86400s
frontend  main
  bind :::9443 v4v6
  default_backend masters
listen health_check_http_url
  bind :::50936 v4v6
  mode http
  monitor-uri /healthz
  option dontlognull
listen stats
  bind localhost:50000
  mode http
  stats enable
  stats hide-version
  stats uri /haproxy_stats
  stats refresh 30s
  stats auth Username:Password
backend masters
   option  httpchk GET /healthz HTTP/1.0
   option  log-health-checks
   balance roundrobin
   server master-0 192.168.222.10:6443 weight 1 verify none check check-ssl inter 3s fall 2 rise 3
   server master-1 192.168.222.11:6443 weight 1 verify none check check-ssl inter 3s fall 2 rise 3
   server master-2 192.168.222.12:6443 weight 1 verify none check check-ssl inter 3s fall 2 rise 3

However, we should switch the health check endpoint from /healthz to /readyz, based on https://github.com/openshift/installer/blob/master/docs/dev/kube-apiserver-health-check.md
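
To make the requested change concrete, here is a minimal sketch that rewrites the two /healthz references from the config above to /readyz. The file names sample-haproxy.cfg and sample-haproxy.readyz.cfg are hypothetical, and this only illustrates the intended diff. It is not the actual machine-config-operator change, and hand-editing the rendered file on a master is not being suggested as a fix.

```shell
# Hypothetical sample file holding only the two affected lines from the config above.
cat > sample-haproxy.cfg <<'EOF'
  monitor-uri /healthz
   option  httpchk GET /healthz HTTP/1.0
EOF

# Point both the monitor URI and the backend health check at /readyz.
sed 's#/healthz#/readyz#g' sample-haproxy.cfg > sample-haproxy.readyz.cfg

# Show the resulting lines.
cat sample-haproxy.readyz.cfg
```

The rest of the config (timeouts, server lines, check intervals) would stay as it is; only the probed path changes.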


We believe this could be causing https://bugzilla.redhat.com/show_bug.cgi?id=1834914


Version-Release number of the following components:

4.5.0-0.nightly-2020-05-08-222601

How reproducible:

Steps to Reproduce:
1. Install OCP on BM using IPI
2.
3.

Comment 1 Sai Sindhur Malleni 2020-05-19 20:13:04 UTC
https://github.com/openshift/machine-config-operator/commit/022933c07a4e37bed097f1cd1fa4cd2d637decc0 fixes it for 4.5; however, I wonder if we should backport this to 4.4.

Comment 2 Sai Sindhur Malleni 2020-05-19 23:09:57 UTC
https://github.com/openshift/machine-config-operator/commit/022933c07a4e37bed097f1cd1fa4cd2d637decc0 fixes it for 4.5; however, I wonder if we should backport this to 4.4.

Comment 3 Honza Pokorny 2020-05-26 16:08:02 UTC
If you need this in 4.4, please clone the original bug to track the backport.

Comment 4 Sai Sindhur Malleni 2020-05-26 18:09:07 UTC
I cannot find the BZ for 4.5; I only found the PR. Regardless, I believe this should be in 4.4. Can you link me to the 4.5 bug? I don't believe this bug should have been closed the way it was.

Comment 5 Sai Sindhur Malleni 2020-05-26 19:26:46 UTC
Cloned this bug to 4.4, if that is what was meant. Here's the clone: https://bugzilla.redhat.com/show_bug.cgi?id=1840366

Comment 12 Victor Voronkov 2020-06-09 17:31:32 UTC
[core@master-0-0 ~]$ cat /etc/haproxy/haproxy.cfg | grep readyz
  monitor-uri /readyz
   option  httpchk GET /readyz HTTP/1.0

[kni@provisionhost-0-0 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-06-09-050255   True        False         159m    Cluster version is 4.5.0-0.nightly-2020-06-09-050255
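
The manual grep above could be scripted along the following lines. This is only a sketch: the function name haproxy_uses_readyz and the sample file verified-haproxy.cfg are hypothetical, and on a master the argument would be /etc/haproxy/haproxy.cfg as in the transcript above.

```shell
# Succeeds only if the given haproxy config references /readyz
# and no longer references /healthz.
haproxy_uses_readyz() {
  cfg="$1"
  grep -q '/readyz' "$cfg" && ! grep -q '/healthz' "$cfg"
}

# Hypothetical sample mirroring the grep output in this comment.
cat > verified-haproxy.cfg <<'EOF'
  monitor-uri /readyz
   option  httpchk GET /readyz HTTP/1.0
EOF

if haproxy_uses_readyz verified-haproxy.cfg; then
  echo "haproxy health checks use /readyz"
else
  echo "haproxy config still references /healthz or lacks /readyz"
fi
```

Checking for the absence of /healthz as well as the presence of /readyz guards against a config where only one of the two lines was updated.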

Comment 13 errata-xmlrpc 2020-07-13 17:40:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

Comment 14 Red Hat Bugzilla 2023-09-14 06:00:30 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

