Bug 2037605 - Openshift Virtualization alert 50% of the hyperconverged-cluster-operator-metrics/hyperconverged-cluster-operator-metrics targets in openshift-cnv namespace have been unreachable for more than 15 minutes on port 8686
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Installation
Version: 4.8.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.10.2
Assignee: João Vilaça
QA Contact: Satyajit Bulage
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-06 04:45 UTC by Yash
Modified: 2025-10-03 11:28 UTC (History)

Fixed In Version: hco-bundle-registry-container-v4.10.2-5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-14 17:42:17 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt hyperconverged-cluster-operator pull 1711 | 0 | None | Merged | Remove old HCO metrics services and endpoints when upgrading | 2022-02-23 15:25:55 UTC
Github | kubevirt hyperconverged-cluster-operator pull 1798 | 0 | None | Merged | [release-1.6] Remove old HCO metrics services and endpoints when upgrading | 2022-03-03 12:23:38 UTC
Github | kubevirt hyperconverged-cluster-operator pull 1937 | 0 | None | Merged | Remove label protection when removing old metrics objects | 2022-05-17 12:26:30 UTC
Github | kubevirt hyperconverged-cluster-operator pull 1947 | 0 | None | Merged | [release-1.6] Remove label protection when removing old metrics objects | 2022-05-17 12:26:30 UTC
Red Hat Product Errata | RHSA-2022:5026 | 0 | None | None | None | 2022-06-14 17:43:15 UTC

Description Yash 2022-01-06 04:45:29 UTC
Description of problem:
The alerts below are generated because services exposing port 8686 are still present after the cluster has been upgraded.

~~~
alertname = TargetDown
cluster = XX
datacenter = XX
job = hyperconverged-cluster-operator-metrics
namespace = openshift-cnv
prometheus = openshift-monitoring/k8s
service = hyperconverged-cluster-operator-metrics
severity = warning
Annotations
description = 50% of the hyperconverged-cluster-operator-metrics/hyperconverged-cluster-operator-metrics targets in openshift-cnv namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.
summary = Some targets were not reachable from the monitoring server for an extended period of time.
~~~

[Hostname]$ oc get svc |grep metrics|grep hypercon
hyperconverged-cluster-operator-metrics                          ClusterIP  10.255.112.238  <none>       8383/TCP,8686/TCP  324d       <<<<<<
hyperconverged-cluster-webhook-metrics                           ClusterIP  10.255.96.250   <none>       8383/TCP,8686/TCP  324d
kubevirt-hyperconverged-operator-metrics                         ClusterIP  10.255.139.248  <none>       8383/TCP           7d        <<<<<<<<

[Hostname]$ oc get endpoints |grep operator |grep hyperconverged
hyperconverged-cluster-operator-metrics                          10.254.13.93:8383,10.254.13.93:8686                               324d  
kubevirt-hyperconverged-operator-metrics                         10.254.13.93:8383                                                 7d
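
The listing above can be narrowed to the services that still expose the old port. The following is a minimal sketch, not part of the original report: the helper name `services_with_port` is hypothetical, and it assumes the default `oc get svc` column order with PORT(S) in the fifth column.

```shell
# Hypothetical helper: print the names of services whose PORT(S) column
# includes the given port. Reads `oc get svc` style rows on stdin and assumes
# the default column order: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE.
services_with_port() {
  awk -v port="$1" '
    {
      n = split($5, ports, ",")
      for (i = 1; i <= n; i++) {
        # "8686/TCP" or "8686:30086/TCP" -> first field is the service port
        split(ports[i], parts, "[:/]")
        if (parts[1] == port) { print $1; break }
      }
    }'
}

# Usage (assumes a logged-in oc session):
#   oc get svc -n openshift-cnv --no-headers | services_with_port 8686
```

Matching on the parsed port number rather than a plain `grep 8686` avoids false hits on cluster IPs or ages that happen to contain the same digits.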

Version-Release number of selected component (if applicable):
Tested in OCP 4.6 and OpenShift Virtualization 2.5
Tested in OCP 4.7 and OpenShift Virtualization 2.6
Tested in OCP 4.8 and OpenShift Virtualization 4.8

How reproducible:
Every time, after upgrading from 4.6 to 4.8

Steps to Reproduce:
1. Install OpenShift Virtualization on OCP 4.6.
2. Upgrade OCP from 4.6 to 4.8.

Actual results:
Leftover OpenShift Virtualization services are still present after the upgrade.

Expected results:
There should not be any service with an endpoint on port 8686, because the hco-operator no longer listens on port 8686 and only listens on port 8383.
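
The expected result can be checked mechanically against the endpoints listing. This is a sketch under stated assumptions, not taken from the fix itself: the function name `no_endpoint_on_port` is hypothetical, and it assumes the default `oc get endpoints` layout with the address list in the second column.

```shell
# Hypothetical check: succeed only if no row of `oc get endpoints` output
# (columns: NAME ENDPOINTS AGE) lists an address on the given port.
# Limitation: long address lists are abbreviated in this output, so only the
# addresses actually printed are inspected.
no_endpoint_on_port() {
  awk -v port="$1" '
    {
      n = split($2, addrs, ",")
      for (i = 1; i <= n; i++)
        if (addrs[i] ~ (":" port "$")) found = 1   # address ends in ":<port>"
    }
    END { exit(found ? 1 : 0) }'
}

# Usage (assumes a logged-in oc session):
#   oc get endpoints -n openshift-cnv --no-headers | no_endpoint_on_port 8686 \
#     && echo "no endpoints on 8686" || echo "stale endpoints on 8686 remain"
```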


Additional info:

Comment 4 Simone Tiraboschi 2022-05-02 14:21:22 UTC
postponing to 4.10.2, not so urgent.

Comment 12 errata-xmlrpc 2022-06-14 17:42:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.2 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5026

