Created attachment 1626027 [details]
Diagnostics output

This bug was initially created as a copy of Bug #1676720

I am copying this bug because:

Description of problem:
Diagnostics fails when checking the status of the logging curator. This appears to be because curator is controlled by a cronjob in 3.11, so no curator pod may be running at the time of the health check.

How reproducible:
Consistently

Steps to Reproduce:
# oc adm diagnostics AggregatedLogging on a 3.11.146 cluster.

Actual results:
The curator health check fails with an error message even though no pod is currently scheduled to run.

Expected results:
The curator health check passes based on the newer cronjob implementation.

---

[root@master01 ~]# rpm -qa | grep openshift
atomic-openshift-node-3.11.146-1.git.0.4aab273.el7.x86_64
atomic-openshift-docker-excluder-3.11.146-1.git.0.4aab273.el7.noarch
atomic-openshift-hyperkube-3.11.146-1.git.0.4aab273.el7.x86_64
atomic-openshift-clients-3.11.146-1.git.0.4aab273.el7.x86_64
atomic-openshift-excluder-3.11.146-1.git.0.4aab273.el7.noarch
atomic-openshift-3.11.146-1.git.0.4aab273.el7.x86_64
[root@master01 ~]# rpm -qa | grep ansible
ansible-2.6.19-1.el7ae.noarch
[root@master01 ~]#
---
This is in fact not the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1676720, as indicated by the output of the diagnostics. Please provide a snapshot of the environment: https://github.com/openshift/origin-aggregated-logging/blob/release-3.11/hack/logging-dump.sh
*** Bug 1776778 has been marked as a duplicate of this bug. ***
*** Bug 1736825 has been marked as a duplicate of this bug. ***
*** Bug 1801613 has been marked as a duplicate of this bug. ***
Diagnostics pass although the last cronjob failed.

# oc adm diagnostics AggregatedLogging
[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/root/.kube/config'
Info:  Using context for cluster-admin access: 'openshift-logging/ip-172-18-13-190-ec2-internal:8443/system:admin'
[Note] Running diagnostic: AggregatedLogging
       Description: Check aggregated logging integration for proper configuration

Info:  Did not find a DeploymentConfig to support optional component 'mux'. If you require this component, please re-install or update logging and specify the appropriate variable to enable it.
Info:  Looked for 'logging-mux' among the logging services for the project but did not find it. This optional component may not have been specified by logging install options.
ERROR: [AGL0147 from diagnostic AggregatedLogging@openshift/origin/pkg/oc/cli/admin/diagnostics/diagnostics/cluster/aggregated_logging/diagnostic.go:138]
       OauthClient 'kibana-proxy' does not include a redirectURI for route 'logging-es' which is 'es.apps.0312-2ns.qe.rhcloud.com'

[Note] Summary of diagnostics execution (version v3.11.187):
[Note] Errors seen: 1

[root@ip-172-18-13-190 ~]# oc get pods
NAME                                          READY     STATUS      RESTARTS   AGE
logging-curator-1584021720-2mw58              0/1       Error       0          11m
logging-curator-ops-1584021600-nztbh          0/1       Completed   0          13m
logging-es-data-master-kdnkz1v2-4-s5v67       2/2       Running     0          4m
logging-es-ops-data-master-ixzlwxsq-2-qlgph   2/2       Running     0          11m
logging-fluentd-4t7ph                         1/1       Running     0          12m
logging-fluentd-6n9ss                         1/1       Running     0          12m
logging-fluentd-9pcnw                         1/1       Running     0          12m
logging-fluentd-msp7m                         1/1       Running     0          12m
logging-fluentd-tfxvw                         1/1       Running     0          12m
logging-kibana-1-kkvtr                        2/2       Running     0          2h
logging-kibana-ops-1-s4dm9                    2/2       Running     0          36m
rsyslogserver-6648c55975-vdrlt                1/1       Running     0          44m
(In reply to Anping Li from comment #11)
> diagnostics pass although the last cronjob failed.

Diagnostics really only checks that the topology of your cluster logging is correct, not necessarily that everything is functional, though it may do some rudimentary checks for certs. For curator specifically, it only checks for the existence of the cronjobs.
Verified on v3.11.188 as per comment 12.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0793
I did not find anything similar, so I created KCS https://access.redhat.com/solutions/4919681 and linked this bug to it.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days