Description of problem:
When CNAO gets queried for metrics over HTTPS, it responds with HTTP.

Version-Release number of selected component (if applicable):

How reproducible:
Always (?)

Steps to Reproduce:
1. Get into the CNAO pod
2. curl -vvv https://localhost:8080/metrics

Actual results:
It fails

Expected results:
We should be able to serve metrics over HTTPS.

Additional info:
EDITED: A more relevant reproduction scenario in comment #5.
Created PR to adapt KMP as well
Blockers only: Moving to 4.10.1
The backport would require alignment across multiple components and D/S changes. Since the bug is not critical, I'm targeting it to 4.11.
I want to verify the bug, but running the scenario that appears in the bug description ends with an error:

[cnv-qe-jenkins@c01-yoss-411-lw7h2-executor ~]$ oc exec -it -n openshift-cnv cluster-network-addons-operator-56649bcc7f-2m7ww -- bash
Defaulted container "cluster-network-addons-operator" out of: cluster-network-addons-operator, kube-rbac-proxy
bash-4.4$
bash-4.4$ curl -vvv https://localhost:8080/metrics
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8080 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* error:1408F10B:SSL routines:ssl3_get_record:wrong version number
* Closing connection 0
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number
bash-4.4$

@oshoval @phoracek If this scenario is not the one that should be used for reproducing the bug, please supply an accurate reproduction scenario.
Thank you.
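For readers unfamiliar with the "wrong version number" error above: it is what a TLS client typically reports when the peer answers the ClientHello with plain-text HTTP instead of a TLS record. The following is a minimal local sketch of that situation, not CNAO code. The helper names start_plain_http_server and speaks_tls are hypothetical, and the listener is a stand-in for any port that serves plain HTTP.

```python
import socket
import ssl
import threading


def start_plain_http_server() -> int:
    """Start a one-shot plain-HTTP listener on localhost and return its port.

    Hypothetical stand-in for a metrics port that does not speak TLS.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once() -> None:
        conn, _ = listener.accept()
        with conn:
            conn.recv(4096)  # the TLS ClientHello arrives as opaque bytes
            # Answer in plain text, as a non-TLS HTTP server would.
            conn.sendall(b"HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n\r\n")
        listener.close()

    threading.Thread(target=serve_once, daemon=True).start()
    return port


def speaks_tls(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TLS handshake with host:port completes, else False."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # skip certificate checks, like curl --insecure
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                return True
    except (ssl.SSLError, ConnectionError):
        # A plain-text reply cannot be parsed as a TLS record.
        return False


if __name__ == "__main__":
    port = start_plain_http_server()
    print("TLS handshake succeeded:", speaks_tls("127.0.0.1", port))
```

The same probe pointed at a port that terminates TLS (for example, one fronted by a TLS proxy) would be expected to return True, assuming the port is reachable from where the script runs.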
After a session with Or, I verified the bug using this scenario:

1. Get the token for prometheus metrics measurement:
TOKEN=$(oc sa get-token prometheus-k8s -n openshift-monitoring)

2. Get the endpoints IP addresses associated with the CNAO prometheus metrics:
$ oc get ep -A | grep "network-addons"
openshift-cnv   cluster-network-addons-operator-prometheus-metrics   10.128.1.107:8443,10.131.0.74:8443   18h

3. The assumption is that the 2 endpoint addresses belong to the CNAO pod and to the KMP manager pod. Verify that:
$ oc get pods -n openshift-cnv -o wide | grep kube | grep mac
kubemacpool-cert-manager-85fcdf95cd-swt2q             1/1   Running   0   18h   10.131.0.81    c01-yoss-411-lw7h2-worker-0-cfb6p   <none>   <none>
kubemacpool-mac-controller-manager-66b5dc49c5-5sjkj   2/2   Running   0   18h   10.128.1.107   c01-yoss-411-lw7h2-master-1         <none>   <none>
$ oc get pods -n openshift-cnv -o wide | grep cluster-network-addons-operator
cluster-network-addons-operator-56649bcc7f-2m7ww      2/2   Running   0   18h   10.131.0.74    c01-yoss-411-lw7h2-worker-0-cfb6p   <none>   <none>

4. Enter the CNAO pod:
$ oc exec -it -n openshift-cnv cluster-network-addons-operator-56649bcc7f-2m7ww -- bash
Defaulted container "cluster-network-addons-operator" out of: cluster-network-addons-operator, kube-rbac-proxy
bash-4.4$

5. Try connecting (using curl) securely (i.e. to the https destination) to the CNAO pod:
bash-4.4$ curl https://10.131.0.74:8443/metrics --header "Authorization: Bearer $TOKEN" --insecure
<The response is very long, but I see it contains a lot of metric values, and it doesn't end with an error>

6. Try connecting (using curl) securely (i.e. to the https destination) to the KMP manager pod:
bash-4.4$ curl https://10.128.1.107:8443/metrics --header "Authorization: Bearer $TOKEN" --insecure
<The response is very long, but I see it contains a lot of metric values, and it doesn't end with an error>

7. Extra verification:
a. Verify that the curl output to the CNAO pod includes kubevirt_cnao_cr_kubemacpool_deployed:
bash-4.4$ curl https://10.131.0.74:8443/metrics --header "Authorization: Bearer $TOKEN" --insecure | grep kubevirt_cnao_cr_kubemacpool_deployed
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP kubevirt_cnao_cr_kubemacpool_deployed KubeMacpool is deployed by CNAO CR
# TYPE kubevirt_cnao_cr_kubemacpool_deployed gauge
kubevirt_cnao_cr_kubemacpool_deployed 1
100 23383    0 23383    0     0   913k      0 --:--:-- --:--:-- --:--:--  913k
bash-4.4$

b. Verify that the curl output to the KMP pod includes kubevirt_kmp_duplicate_macs:
bash-4.4$ curl https://10.128.1.107:8443/metrics --header "Authorization: Bearer $TOKEN" --insecure | grep kubevirt_kmp_duplicate_macs;echo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP kubevirt_kmp_duplicate_macs Kubemacpool duplicate macs counter
# TYPE kubevirt_kmp_duplicate_macs counter
kubevirt_kmp_duplicate_macs 0
100 19729    0 19729    0     0  67334      0 --:--:-- --:--:-- --:--:-- 67105
bash-4.4$

Verified on CNV 4.11.0, with cluster-network-addons-operator-container-v4.11.0-20

Thank you, Or, for the guidance.
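The curl and grep checks in step 7 could also be scripted. Below is a rough sketch using only the Python standard library; it is not part of the verification that was actually run. The function names fetch_metrics and metric_values are hypothetical, and the parser handles only the simple "name value" and "name{labels} value" sample lines of the Prometheus text format.

```python
import ssl
import urllib.request
from typing import List


def fetch_metrics(url: str, token: str) -> str:
    """Fetch a metrics page over HTTPS with a bearer token.

    Certificate verification is disabled, mirroring curl --insecure above.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
        return resp.read().decode("utf-8")


def metric_values(text: str, name: str) -> List[float]:
    """Return the values of all samples of metric `name` in exposition text."""
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if line.startswith(name + " "):
            rest = line[len(name):]            # unlabelled sample
        elif line.startswith(name + "{"):
            rest = line[line.rfind("}") + 1:]  # labelled sample
        else:
            continue
        values.append(float(rest.split()[0]))  # first token is the value
    return values


if __name__ == "__main__":
    # Sample text copied from the curl output in step 7a.
    sample = (
        "# HELP kubevirt_cnao_cr_kubemacpool_deployed KubeMacpool is deployed by CNAO CR\n"
        "# TYPE kubevirt_cnao_cr_kubemacpool_deployed gauge\n"
        "kubevirt_cnao_cr_kubemacpool_deployed 1\n"
    )
    print(metric_values(sample, "kubevirt_cnao_cr_kubemacpool_deployed"))
```

Against a live cluster, fetch_metrics("https://10.131.0.74:8443/metrics", token) would supply the text, assuming the token and pod IP from steps 1 to 3 and network access to the pod.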
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:6526