Bug 1663268 - Enabling router metrics browser page cause "Readiness probe failed: HTTP probe failed with statuscode: 401"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.11.z
Assignee: Dan Mace
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-01-03 15:34 UTC by Alan Chan
Modified: 2022-08-04 22:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-14 02:17:59 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0407 0 None None None 2019-03-14 02:18:07 UTC

Description Alan Chan 2019-01-03 15:34:15 UTC
Description of problem:

Following the documentation at https://docs.openshift.com/container-platform/3.11/install_config/router/default_haproxy_router.html#exposing-the-router-metrics to enable the router metrics page for browser access causes the router pod to fail to start with "Readiness probe failed: HTTP probe failed with statuscode: 401".


Version-Release number of selected component (if applicable):

3.11.51


How reproducible:

- Repeatedly.


Steps to Reproduce:

1. A working cluster with router pods

2. oc set env dc router ROUTER_METRICS_TYPE- ROUTER_LISTEN_ADDR-

3. oc -n default get pod -w
  - shows pod stuck at deploy and eventually fails deploy

4. oc -n default get events
  - shows readiness probe failure


Actual results:

- router pod fails to deploy

Expected results:

- router pod deploys successfully and the metrics page is reachable from a browser on port 1936


Additional info:

Comment 1 Alan Chan 2019-01-07 20:31:05 UTC
Workaround based on what Luke found in the case 02284589 from 3.9:

- oc patch dc router -p '"spec": {"template": {"spec": {"containers": [{"name": "router","readinessProbe": {"httpGet": {"path": "healthz"}}}]}}}'

- oc set env dc router-dmz ROUTER_METRICS_TYPE-

Comment 2 Alan Chan 2019-01-07 20:33:52 UTC
(In reply to Alan C from comment #1)
> Workaround based on what Luke found in the case 02284589 from 3.9:
> 
> - oc patch dc router -p '"spec": {"template": {"spec": {"containers":
> [{"name": "router","readinessProbe": {"httpGet": {"path": "healthz"}}}]}}}'
> 
> - oc set env dc router-dmz ROUTER_METRICS_TYPE-

Sorry, typo, should be:

- oc set env dc router ROUTER_METRICS_TYPE-

...to match the patch command.
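
The workaround above can be wrapped in a small script. This is only a sketch of the two commands from comment 1, not a tested procedure: the function names readiness_patch and apply_workaround are made up for illustration, the "default" namespace is taken from the reproduction steps, the container name "router" is taken from the patch in comment 1, and the final "oc rollout status" wait is an addition that is not part of the original workaround. The patch is emitted as complete JSON with a leading slash on the path.

```shell
#!/bin/sh
# Build a patch that points the readiness probe of one container at a path.
# $1 = container name, $2 = probe path
readiness_patch() {
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","readinessProbe":{"httpGet":{"path":"%s"}}}]}}}}' "$1" "$2"
}

# Apply the workaround from comment 1 to a router deployment config.
# $1 = dc name (default: router), $2 = namespace (default: default)
# Assumption: the container inside the dc is named "router".
apply_workaround() {
  dc=${1:-router}
  ns=${2:-default}
  oc -n "$ns" patch dc "$dc" -p "$(readiness_patch router /healthz)" &&
    oc -n "$ns" set env dc "$dc" ROUTER_METRICS_TYPE- &&
    oc -n "$ns" rollout status "dc/$dc"   # added here: wait for the redeploy
}
```

Calling "apply_workaround router default" on a logged-in cluster would run the patch, remove the variable, and then wait for the new deployment to finish.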

Comment 3 Ram Ranganathan 2019-02-13 00:09:06 UTC
Yeah, the issue is that we changed the liveness and readiness probe paths to be different so the two checks can be distinguished. But when you disable the listener on the stats port in openshift-router and enable it in haproxy, there is only one "unauthenticated" endpoint available to use, because haproxy allows only one monitor-uri.

So both probes have to point at the same endpoint, "/healthz", when ROUTER_LISTEN_ADDR is removed. I will create a PR for the docs team to mention in that section that the readiness probe also needs to be updated. Thanks.
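
To see whether a router is in the state described here, the two probe paths on the deployment config can be compared. The following is a rough sketch rather than a supported check: the helper names (probes_match, probe_path, check_router_probes) are hypothetical, and it assumes the router dc is named "router", lives in the "default" namespace, and has the router as its first container.

```shell
#!/bin/sh
# Succeed when two probe paths name the same endpoint.
# A missing leading slash is normalized so "healthz" and "/healthz"
# compare equal (a simplification chosen for this sketch).
probes_match() {
  [ "/${1#/}" = "/${2#/}" ]
}

# Read one probe path from the router dc.
# $1 = livenessProbe or readinessProbe
# Assumptions: namespace "default", dc "router", first container.
probe_path() {
  oc -n default get dc router \
    -o jsonpath="{.spec.template.spec.containers[0].$1.httpGet.path}"
}

# Report whether liveness and readiness probes share one endpoint.
check_router_probes() {
  live=$(probe_path livenessProbe)
  ready=$(probe_path readinessProbe)
  if probes_match "$live" "$ready"; then
    echo "probes share endpoint: $live"
  else
    echo "probes differ: liveness=$live readiness=$ready"
    return 1
  fi
}
```

If check_router_probes reports differing paths after ROUTER_LISTEN_ADDR has been removed, the readiness probe is the one that needs to be moved to "/healthz".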

Comment 4 Ram Ranganathan 2019-02-13 01:44:34 UTC
Docs PR: https://github.com/openshift/openshift-docs/pull/13608

Comment 5 Dan Mace 2019-02-18 21:15:25 UTC
Moving to MODIFIED as the docs update merged.

Comment 7 Ram Ranganathan 2019-02-25 06:23:05 UTC
Docs PR against master: https://github.com/openshift/openshift-docs/pull/13622 

The 3.11 PR (https://github.com/openshift/openshift-docs/pull/13608) was closed by Vikram, so I am not certain this is ready for QA unless I missed another docs PR.

Comment 8 Hongan Li 2019-02-26 08:25:13 UTC
The docs PR looks good, thanks.

Comment 10 errata-xmlrpc 2019-03-14 02:17:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0407

