Bug 1394716 - [Logging diagnostics] Error shows "no Pods found for DeploymentConfig 'logging-curator-ops'" after deploying with the ops cluster enabled.
Summary: [Logging diagnostics] Error shows "no Pods found for DeploymentConfig 'logging-curator-ops'" after deploying with the ops cluster enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.4.0
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jeff Cantrill
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-14 10:02 UTC by Junqi Zhao
Modified: 2017-10-02 12:24 UTC
CC: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: curator-ops was not in the list of dcs to investigate.
Consequence: The diagnostic tool does not properly evaluate curator for an ops logging cluster.
Fix: Add curator-ops to the list of dcs to investigate.
Result: The diagnostic tool now properly evaluates curator for an ops logging cluster.
Clone Of:
Environment:
Last Closed: 2017-04-12 19:16:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID: Red Hat Product Errata RHBA-2017:0884
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.5 RPM Release Advisory
Last Updated: 2017-04-12 22:50:07 UTC

Description Junqi Zhao 2016-11-14 10:02:24 UTC
Description of problem:
When running the logging diagnostics on a healthy OpenShift environment with the ops cluster enabled, a spurious error about the absence of the curator-ops pod is reported. The issue does not reproduce when logging is deployed without the ops cluster.

Version-Release number of selected component (if applicable):
openshift v3.4.0.25+1f36858
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

logging images from ops registry:
ops.*/logging-deployer   3.4.0               08eaf2753130        2 days ago          764.3 MB

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging with "enable-ops-cluster=true" and make sure all the pods are running, especially the logging-curator-ops pod:
# oc get pods
NAME                              READY     STATUS      RESTARTS   AGE
logging-curator-1-vbu8a           1/1       Running     1          1h
logging-curator-ops-1-usnqm       1/1       Running     0          38m
logging-deployer-07p82            0/1       Completed   0          1h
logging-es-ops-hol5nqp6-1-qik5z   1/1       Running     0          1h
logging-es-vx1qswbu-1-ca7g9       1/1       Running     0          1h
logging-fluentd-sbijb             1/1       Running     0          1h
logging-kibana-1-ynrtn            2/2       Running     0          1h
logging-kibana-ops-1-apym8        2/2       Running     0          1h

# oc get dc
NAME                      REVISION   DESIRED   CURRENT   TRIGGERED BY
logging-curator           1          1         1         
logging-curator-ops       1          1         1         
logging-es-ops-hol5nqp6   1          1         1         
logging-es-vx1qswbu       1          1         1         
logging-kibana            1          1         1         
logging-kibana-ops        1          1         1  

2. Diagnose aggregated logging:
   # oadm diagnostics AggregatedLogging
3. Check the diagnostics results.

Actual results:
Error shows:
There were no Pods found for DeploymentConfig 'logging-curator-ops'

However, the logging-curator-ops pod is running and the logging-curator-ops dc exists.

# oadm diagnostics AggregatedLogging
[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/root/.kube/config'
Info:  Using context for cluster-admin access: 'logging/ip-xx:8443/system:admin'

[Note] Running diagnostic: AggregatedLogging
       Description: Check aggregated logging integration for proper configuration
       
Info:  Found route 'logging-kibana' matching logging URL 'kibana.xx.com' in project: 'logging'

ERROR: [AGL0095 from diagnostic
AggregatedLogging@openshift/origin/pkg/diagnostics/cluster/aggregated_logging/diagnostic.go:96]
       There were no Pods found for DeploymentConfig 'logging-curator-ops'.  Try running
       the following commands for additional information:
       
         oc describe dc logging-curator-ops -n logging
         oc get events -n logging
       
WARN:  [AGL0425 from diagnostic
AggregatedLogging@openshift/origin/pkg/diagnostics/cluster/aggregated_logging/diagnostic.go:104]
       There are some nodes that match the selector for DaemonSet 'logging-fluentd'.  
       A list of matching nodes can be discovered by running:
       
         oc get nodes -l logging-infra-fluentd=true




Expected results:
The aggregated logging diagnostics should report that the logging system is running healthy and no issues were found.

Additional info:
This issue does not reproduce if logging is deployed without the ops cluster.

Comment 1 Jeff Cantrill 2016-12-01 19:30:58 UTC
fixed in https://github.com/openshift/origin/pull/12099
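
For reference, the shape of the change is roughly the following. This is a minimal, hypothetical Go sketch of the idea described in the Doc Text (the names loggingComponents and hasComponent are illustrative, not the literal identifiers in openshift/origin or in that PR): "curator-ops" was missing from the set of components whose DeploymentConfigs and pods the AggregatedLogging diagnostic inspects, so the ops curator was never matched against its pods.

package main

import "fmt"

// loggingComponents lists the component names whose DeploymentConfigs the
// AggregatedLogging diagnostic inspects (hypothetical names for illustration).
// Before the fix, "curator-ops" was missing, so with an ops cluster deployed
// the diagnostic never matched the logging-curator-ops pods and raised AGL0095
// even though the pod was running.
var loggingComponents = []string{
	"es", "es-ops",
	"kibana", "kibana-ops",
	"curator",
	"curator-ops", // added by the fix
	"fluentd",
}

// hasComponent reports whether the diagnostic will look for pods of the given component.
func hasComponent(name string) bool {
	for _, c := range loggingComponents {
		if c == name {
			return true
		}
	}
	return false
}

func main() {
	// With "curator-ops" in the list, the logging-curator-ops DC is evaluated
	// against its pods instead of being reported as having no pods.
	fmt.Println(hasComponent("curator-ops"))
}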

Comment 2 Junqi Zhao 2016-12-14 10:26:11 UTC
Tested on an Origin environment; the issue no longer occurs.
Please merge to OSE so that we can verify and close this issue.

Image Id:
openshift/origin-logging-curator    e2acbe1e04b6
openshift/origin-logging-fluentd    0e106c37e804
openshift/origin-logging-auth-proxy    c4bb5b5d17cf
openshift/origin-logging-deployer    45e11bcdbc0a
openshift/origin-logging-elasticsearch    125e6f97435c
openshift/origin-logging-kibana    614d0c989e42

Comment 3 Troy Dawson 2017-01-20 22:57:06 UTC
This has been merged into OCP and is in OCP v3.5.0.7 or newer.

Comment 4 Junqi Zhao 2017-01-22 06:47:53 UTC
Verified on openshift v3.5.0.7 with logging 3.5.0: deployed logging 3.5.0 with enable-ops-cluster=true, and the error "no Pods found for DeploymentConfig 'logging-curator-ops'" is no longer thrown.

oc get po
NAME                              READY     STATUS      RESTARTS   AGE
logging-curator-1-x42sd           1/1       Running     0          8m
logging-curator-ops-1-lmzc3       1/1       Running     0          8m
logging-deployer-x1vjj            0/1       Completed   0          9m
logging-es-ops-pr20rovo-1-lvfbk   1/1       Running     0          8m
logging-es-ywgs4cs7-1-40618       1/1       Running     0          8m
logging-fluentd-7bxg6             1/1       Running     0          8m
logging-kibana-1-j53p2            2/2       Running     0          8m
logging-kibana-ops-1-bgbk5        2/2       Running     0          8m

openshift version:
openshift v3.5.0.7+390ef18
kubernetes v1.5.2+43a9be4
etcd 3.1.0-rc.0

Image id:
openshift3/logging-deployer    1c7f8f5bb5cc
openshift3/logging-kibana    b5f8fe3fa247
openshift3/logging-auth-proxy    139f7943475e
openshift3/logging-fluentd    e0b004b486b4
openshift3/logging-elasticsearch    7015704dc0f8
openshift3/logging-curator    7f034fdf7702

Setting this to VERIFIED; this bug can be closed.

Comment 6 errata-xmlrpc 2017-04-12 19:16:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884

Comment 7 Junqi Zhao 2017-05-12 07:15:48 UTC
@Jeff,

The same issue happens on OCP 3.4.1. Since this defect was fixed in OCP 3.5.0, I have one question: should we backport the fix to OCP 3.4.0 and OCP 3.4.1, even though the severity of this defect is not serious?


Thanks

Comment 8 Junqi Zhao 2017-05-12 07:48:12 UTC
(In reply to Junqi Zhao from comment #7)
> @Jeff,
> 
> The same issue happens on OCP 3.4.1. Since this defect was fixed in OCP
> 3.5.0, I have one question: should we backport the fix to OCP 3.4.0 and
> OCP 3.4.1, even though the severity of this defect is not serious?
> 
> 
> Thanks

Correcting my wording to:

The same issue happens on Logging 3.4.1. Since this defect was fixed in Logging 3.5.0, I have one question: should we backport the fix to Logging 3.4.0 and Logging 3.4.1, even though the severity of this defect is not serious?

Comment 9 Jeff Cantrill 2017-10-02 12:24:41 UTC
Given the low severity of the issue and the fact that we are moving into the 3.7 release, we will only backport if directed by PM.

