Description of problem:

The httplog value of the HTTP response code is always zero, for example:

I1201 01:06:54.612229 19 httplog.go:94] "HTTP" verb="GET" URI="/api/v1/namespaces/openshift-monitoring/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dprometheus-k8s-rulefiles-0&resourceVersion=32579&timeout=5m10s&timeoutSeconds=310&watch=true" latency="26.42360465s" userAgent="kubelet/v1.19.2+ad738ba (linux/amd64) kubernetes/ad738ba" srcIP="10.0.202.49:25022" resp=0
I1201 00:50:09.052969 18 httplog.go:94] "HTTP" verb="GET" URI="/readyz" latency="9.85141ms" userAgent="ELB-HealthChecker/2.0" srcIP="10.0.56.74:18849" resp=0

This is a real problem because we need this value to easily prove that the LB health check gets a non-200 response during termination.

Version-Release number of selected component (if applicable):
4.7 for sure; probably a bug we have had for a long time.

How reproducible:
Always. Check the audit termination logs, where httplog is printed independently of the verbosity.

Actual results:
resp=0

Expected results:
resp=200, resp=503, ...
Stefan noticed that the response code is missing only for /readyz and watch requests; for other requests it is set.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
A PR is open and ready for review.
The LifecycleStale keyword was removed because the bug got commented on recently. The bug assignee was notified.
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-17-211555   True        False         57m     Cluster version is 4.7.0-0.nightly-2021-01-17-211555

$ oc debug node/<master node>
...
sh-4.4# cd /var/log/kube-apiserver
sh-4.4# grep 'httplog.go.*readyz.*resp=500' termination.log
...
I0118 06:37:30.966346 18 httplog.go:94] "HTTP" verb="GET" URI="/readyz" latency="2.843191ms" userAgent="ELB-HealthChecker/2.0" srcIP="10.0.52.66:52533" resp=500
I0118 06:37:31.287344 18 httplog.go:94] "HTTP" verb="GET" URI="/readyz" latency="3.58769ms" userAgent="ELB-HealthChecker/2.0" srcIP="10.0.188.240:18384" resp=500
I0118 06:37:31.376421 18 httplog.go:94] "HTTP" verb="GET" URI="/readyz" latency="2.861812ms" userAgent="ELB-HealthChecker/2.0" srcIP="10.0.188.240:41642" resp=500
I0118 06:37:31.692545 18 httplog.go:94] "HTTP" verb="GET" URI="/readyz" latency="3.182211ms" userAgent="ELB-HealthChecker/2.0" srcIP="10.0.146.166:50504" resp=500
...

The termination log now shows a non-zero response code for /readyz. As the PR notes, for health checks on paths such as /readyz, HTTP 500 is considered a normal response that signals the LB to stop sending traffic for that path. Moving the bug to VERIFIED.
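The grep above checks for one specific code. As a rough sketch of the same verification over all codes, the Go snippet below tallies the resp= values for a given URI from httplog lines. The regular expression is an assumption based only on the line format shown in this bug, and countRespCodes / countRespCodesInString are hypothetical helper names, not part of any OpenShift tooling.

```go
package main

import (
	"bufio"
	"io"
	"regexp"
	"strconv"
	"strings"
)

// httplogLine matches the httplog format quoted in this bug (an assumption;
// other releases may format the line differently). Capture groups: verb,
// URI, response code.
var httplogLine = regexp.MustCompile(
	`httplog\.go:\d+\] "HTTP" verb="([A-Z]+)" URI="([^"]*)".* resp=(\d+)`)

// countRespCodes reads log lines from r and counts response codes for
// requests whose URI equals uri exactly. Non-matching lines are skipped.
func countRespCodes(r io.Reader, uri string) (map[int]int, error) {
	counts := make(map[int]int)
	sc := bufio.NewScanner(r)
	// Watch request lines can be long; raise the scanner's line limit.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		m := httplogLine.FindStringSubmatch(sc.Text())
		if m == nil || m[2] != uri {
			continue
		}
		code, err := strconv.Atoi(m[3])
		if err != nil {
			continue
		}
		counts[code]++
	}
	return counts, sc.Err()
}

// countRespCodesInString is a convenience wrapper for in-memory log text.
func countRespCodesInString(log, uri string) (map[int]int, error) {
	return countRespCodes(strings.NewReader(log), uri)
}
```

Passing an opened termination.log and the URI "/readyz" to countRespCodes returns a map from response code to occurrence count. A non-zero count under key 0 would indicate that the original problem is still present.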
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633