Bug 1698525

Summary: SDN metrics not collected
Product: OpenShift Container Platform
Reporter: Frederic Branczyk <fbranczy>
Component: Networking
Assignee: Casey Callendrello <cdc>
Status: CLOSED ERRATA
QA Contact: Meng Bo <bmeng>
Severity: medium
Docs Contact:
Priority: high
Version: 4.1.0
CC: anusaxen, aos-bugs, bbennett, cdc, sjenning, spasquie, zzhao
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-04 10:47:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Frederic Branczyk 2019-04-10 14:19:12 UTC
Description of problem:

The SDN monitoring targets configured in Prometheus are not being scraped successfully. This causes alerts to fire and prevents SDN metrics from being collected.

```
openshift-sdn/monitor-sdn/0 (0/6 up) 
http://10.0.133.168:9101/metrics: dial tcp 10.0.133.168:9101: connect: connection refused
```
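The "connection refused" error above means the TCP connection to the metrics port was actively rejected, which usually indicates that nothing is listening on that address and port. A minimal sketch of a probe that separates this case from a timeout or routing failure follows. The function name `probe_tcp` and its return strings are hypothetical and are not part of any OpenShift or Prometheus tooling.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP endpoint as "open", "refused", or "unreachable".

    "refused" means the peer rejected the connection, typically because
    no process is listening on that port. "unreachable" covers timeouts
    and other socket-level errors.
    """
    try:
        # create_connection performs the TCP handshake; the context
        # manager closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "unreachable"
```

Run from a host that can reach the node network, `probe_tcp("10.0.133.168", 9101)` would be expected to report "refused" in the state described here, and "open" once something is listening on the metrics port.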

Version-Release number of selected component (if applicable):

4.1


How reproducible:

Always


Steps to Reproduce:
1.
2.
3.

Actual results:

All monitoring-sdn targets are unsuccessfully scraped.

Expected results:


All monitoring-sdn targets are successfully scraped.
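
One way to check the expected result is to summarize the response of the Prometheus targets endpoint (`/api/v1/targets`) for a single scrape pool. The sketch below is an illustration under stated assumptions: the field names `activeTargets`, `scrapePool`, `scrapeUrl`, `health`, and `lastError` follow the current Prometheus HTTP API documentation and may not all be present in the Prometheus version shipped with 4.1. The helper `summarize_pool` is hypothetical, and the pool name is taken from the output in this report.

```python
from typing import Any, Dict, List, Tuple


def summarize_pool(
    targets_response: Dict[str, Any], scrape_pool: str
) -> Tuple[int, int, List[str]]:
    """Return (up, total, errors) for one scrape pool.

    `targets_response` is the decoded JSON body of /api/v1/targets.
    Field names are assumed from the Prometheus HTTP API docs.
    """
    active = targets_response.get("data", {}).get("activeTargets", [])
    pool = [t for t in active if t.get("scrapePool") == scrape_pool]
    up = sum(1 for t in pool if t.get("health") == "up")
    # Collect one "url: error" line per target that is not healthy.
    errors = [
        f'{t.get("scrapeUrl")}: {t.get("lastError")}'
        for t in pool
        if t.get("health") != "up"
    ]
    return up, len(pool), errors


# Illustrative input shaped like the failure in this report (one target shown).
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "scrapePool": "openshift-sdn/monitor-sdn/0",
                "scrapeUrl": "http://10.0.133.168:9101/metrics",
                "health": "down",
                "lastError": "dial tcp 10.0.133.168:9101: connect: connection refused",
            }
        ]
    },
}
up, total, errors = summarize_pool(sample, "openshift-sdn/monitor-sdn/0")
```

The expected result corresponds to `up == total` with an empty `errors` list for the pool.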

Additional info:

Comment 1 Casey Callendrello 2019-04-10 14:27:26 UTC
Eek, that's not good.
Jacob, can you look into this ASAP?

Comment 2 Meng Bo 2019-04-11 02:22:40 UTC
Hi Anurag,

Can you also take a look at this? I think the metrics should work well in our testing.

Comment 3 Frederic Branczyk 2019-04-16 06:59:03 UTC
*** Bug 1700074 has been marked as a duplicate of this bug. ***

Comment 4 Casey Callendrello 2019-04-17 09:57:56 UTC
https://github.com/openshift/cluster-network-operator/pull/145 filed

Comment 6 Anurag saxena 2019-04-18 18:59:33 UTC
(In reply to Meng Bo from comment #2)
> Hi Anurag,
> 
> Can you also take a look at this? I think the metrics should work well in
> our testing.

It seems broken to me on Server Version: v1.13. I believe https://github.com/openshift/cluster-network-operator/pull/145 should fix it. I can check on the next good green build, which is still blocked by BZ 1700504.

Comment 7 Anurag saxena 2019-04-18 20:22:49 UTC
Metrics seem good to me on 4.1.0-0.nightly-2019-04-18-170154, which contains the PR mentioned in comment 4. Thanks!

Comment 8 zhaozhanqi 2019-04-22 06:21:01 UTC
So this bug can be marked as verified, per comment 7.

Comment 10 errata-xmlrpc 2019-06-04 10:47:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758