Bug 1986829
Summary: | [AUTH-20] Make prometheus authenticate with a certificate while scraping the cluster's core components metrics | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Standa Laznicka <slaznick> |
Component: | apiserver-auth | Assignee: | Standa Laznicka <slaznick> |
Status: | CLOSED ERRATA | QA Contact: | Rahul Gangwar <rgangwar> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 4.9 | CC: | aos-bugs, kewang, liyao, mfojtik, surbania, xxia |
Target Milestone: | --- | ||
Target Release: | 4.9.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-10-18 17:42:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Standa Laznicka
2021-07-28 12:03:48 UTC
oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-07-175228   True        False         2m10s   Cluster version is 4.9.0-0.nightly-2021-08-07-175228

Checked the metrics client certificate:

oc get secret -n openshift-monitoring
metrics-client-certs   Opaque   2   22m

oc get csr
system:openshift:openshift-monitoring-gnqcs   30s   kubernetes.io/kube-apiserver-client   system:serviceaccount:openshift-monitoring:cluster-monitoring-operator   Approved,Issued

Checked the metrics client certificate again to confirm that a new cert was issued:

oc get secret -n openshift-monitoring
metrics-client-certs   Opaque   2   2m30s

Gathered Prometheus metrics with curl, using the client cert, from the operators below:

openshift-apiserver-operator
openshift-kube-apiserver-operator
openshift-kube-controller-manager-operator
openshift-kube-storage-version-migrator-operator

For example:

oc rsh -n openshift-apiserver-operator openshift-apiserver-operator-7f7cd7d86c-5bm49
curl -k --key /tmp/tls.key --cert /tmp/tls.crt https://localhost:8443/metrics > /tmp/metrics.txt

The curl commands succeeded, and the resulting /tmp/metrics.txt files were not empty.

Checked the cert with openssl: the user in the subject CN is prometheus-k8s.
openssl x509 -in tls.crt -noout -text | grep CN
Issuer: CN=kube-csr-signer_@1628567334
Subject: CN=system:serviceaccount:openshift-monitoring:prometheus-k8s

oc get pod -n openshift-kube-apiserver -l apiserver --show-labels
NAME                                                READY   STATUS    RESTARTS   AGE   LABELS
kube-apiserver-ci-ln-qvmriyb-f76d1-dt7gb-master-0   5/5     Running   0          25m   apiserver=true,app=openshift-kube-apiserver,revision=5
kube-apiserver-ci-ln-qvmriyb-f76d1-dt7gb-master-1   5/5     Running   0          32m   apiserver=true,app=openshift-kube-apiserver,revision=5
kube-apiserver-ci-ln-qvmriyb-f76d1-dt7gb-master-2   5/5     Running   0          29m   apiserver=true,app=openshift-kube-apiserver,revision=5

Changed the audit profile from Default to WriteRequestBodies in apiserver/cluster and waited for kube-apiserver to restart:

oc get pod -n openshift-kube-apiserver -l apiserver --show-labels
NAME                                                READY   STATUS    RESTARTS   AGE     LABELS
kube-apiserver-ci-ln-qvmriyb-f76d1-dt7gb-master-0   5/5     Running   0          95s     apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ci-ln-qvmriyb-f76d1-dt7gb-master-1   5/5     Running   0          8m18s   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ci-ln-qvmriyb-f76d1-dt7gb-master-2   5/5     Running   0          5m5s    apiserver=true,app=openshift-kube-apiserver,revision=6

After kube-apiserver restarted, waited 15 minutes, then logged in to every master node and gathered the audit logs.
oc debug node/ci-ln-qvmriyb-f76d1-dt7gb-master-2 -T -- chroot /host grep '"requestURI":"/apis/authentication.k8s.io/v1/tokenreviews"' /var/log/kube-apiserver/audit.log > /tmp/all_tokenreviews_requests.log
grep '"status":{"authenticated":true,"user":{"username":"system:serviceaccount:openshift-monitoring:prometheus-k8s"' /tmp/all_tokenreviews_requests.log > /tmp/all_tokenreviews_for_serviceaccount_prometheus-k8s.log
jq '.user.username' /tmp/all_tokenreviews_for_serviceaccount_prometheus-k8s.log > /tmp/all_users_that_make_traffic_to_check_token_of_serviceaccount_prometheus-k8s.log
sort /tmp/all_users_that_make_traffic_to_check_token_of_serviceaccount_prometheus-k8s.log | uniq -c | sort -rh > /tmp/users.txt

Checked that no token validation requests are sent to kube-apiserver by the users below. The loop is expected to print nothing:

for i in kube-apiserver openshift-apiserver openshift-controller-manager kube-scheduler kubelet node-exporter kube-controller-manager etcd; do grep "$i" /tmp/users.txt; done
1 "system:serviceaccount:openshift-controller-manager:openshift-controller-manager-sa"
4 "system:kube-scheduler"

Tokenreview requests for the prometheus SA are still seen from some targets; filed bug https://bugzilla.redhat.com/show_bug.cgi?id=1991900

Also, when kube-apiserver is made unavailable, metrics cannot be gathered; filed bug https://bugzilla.redhat.com/show_bug.cgi?id=1990281

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
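
The final check in the verification steps above (the loop that greps /tmp/users.txt for each component) passes only when it prints nothing, which is easy to misread by eye. A minimal sketch of the same check as a pass/fail helper follows. The function name check_no_tokenreviews and its exit codes are assumptions for illustration, not part of the original test procedure. It uses grep -F so component names are matched as fixed substrings rather than regular expressions.

```shell
#!/usr/bin/env bash
# Sketch only: check_no_tokenreviews is a hypothetical helper, not part of
# the original verification steps.
# Returns 0 if none of the given component names appear in the users file,
# 1 if at least one appears (matching lines are printed), 2 if the file
# cannot be read.
check_no_tokenreviews() {
  local users_file=$1
  shift
  if [ ! -r "$users_file" ]; then
    echo "cannot read $users_file" >&2
    return 2
  fi
  local status=0 component
  for component in "$@"; do
    # -F: fixed-string match; --: stop option parsing before the pattern.
    if grep -F -- "$component" "$users_file"; then
      status=1
    fi
  done
  return "$status"
}

# Usage, with the file and component list from the steps above:
# check_no_tokenreviews /tmp/users.txt kube-apiserver openshift-apiserver \
#   openshift-controller-manager kube-scheduler kubelet node-exporter \
#   kube-controller-manager etcd && echo PASS || echo FAIL
```

As in the original loop, matching is by substring, so a line is printed once for every component name it contains.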