Bug 1671055

Summary: cluster-monitoring-operator loops forever with no error or log when route is not accepted.
Product: OpenShift Container Platform
Reporter: Ryan Howe <rhowe>
Component: Monitoring
Assignee: Frederic Branczyk <fbranczy>
Status: CLOSED DEFERRED
QA Contact: Junqi Zhao <juzhao>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: erich, minden, rbost, surbania
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-18 19:09:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1664187

Description Ryan Howe 2019-01-30 16:30:25 UTC
Description of problem:

When deploying metrics, if the route is never accepted by a router, the cluster-monitoring-operator loops forever waiting for the route to become ready, with no log message showing why it is not progressing to create the other objects.


Version-Release number of selected component (if applicable):
3.11 
4.0 

How reproducible:
100%

Steps to Reproduce:
1. Configure router sharding on all router deployments.
2. Note that there is no way to set a default shard label without configuring a custom default project template.
3. Deploy cluster-monitoring
  
Actual results:
Loops forever performing a GET on the grafana route.

 round_trippers.go:383] GET https://172.22.0.1:443/apis/route.openshift.io/v1/namespaces/openshift-monitoring/routes/grafana


Expected results:
At the very least, log an error.
Then either continue with the loop or move forward creating the rest of the metrics objects, while still checking for the route to become ready and logging more verbosely what it is waiting on.


Additional info:

https://github.com/openshift/cluster-monitoring-operator/blob/master/pkg/client/client.go#L497-L502

https://github.com/openshift/cluster-monitoring-operator/blob/release-3.11/pkg/client/client.go#L377-L381

Comment 1 Robert Bost 2019-01-30 22:00:41 UTC
A similar bug was filed here: bz1666936 (copied to https://jira.coreos.com/browse/RFE-13)

Comment 2 Robert Bost 2019-01-30 22:10:04 UTC
(In reply to Robert Bost from comment #1)
> A similar bug was filed here: bz1666936 (copied to
> https://jira.coreos.com/browse/RFE-13)

Since the bug mentioned above was created to request improvements on the errors, can this bug be used to track an improvement in the Route created by the operator? Or should some other method be used?

Comment 3 minden 2019-01-31 10:48:02 UTC
I think the two can be combined as one. I am linking this Bugzilla ticket in the RFE-13 Jira ticket.

Comment 4 Frederic Branczyk 2019-02-04 15:59:17 UTC
@Ryan I'm unfamiliar with route sharding. Is it possible to always set the "default" route sharding? These routes are super low in traffic so I'd like to find a solution that involves no configuration by customers.

Comment 5 Eric Rich 2019-02-18 19:09:53 UTC
Closing this in favor of the RFE in Jira.