Bug 1464871

Summary: hawkular-openshift-agent-configmap.yaml should be changed since there is no hawkular-metrics-certificate secret in Metrics 3.6.0
Product: OpenShift Container Platform Reporter: Junqi Zhao <juzhao>
Component: Hawkular Assignee: Matt Wringe <mwringe>
Status: CLOSED ERRATA QA Contact: Junqi Zhao <juzhao>
Severity: high Docs Contact:
Priority: high    
Version: 3.6.0 CC: aos-bugs, javier.ramirez, jsanda, juzhao, ksuzumur, mburke, mwringe
Target Milestone: ---   
Target Release: 3.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1510084 (view as bug list) Environment:
Last Closed: 2017-11-28 21:58:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1510084    
Attachments:
Description Flags
secrets under openshift-infra namespace none

Description Junqi Zhao 2017-06-26 07:15:41 UTC
Description of problem:
The config map at https://github.com/openshift/origin-metrics/blob/enterprise/hawkular-openshift-agent/hawkular-openshift-agent-configmap.yaml contains:
ca_cert_file: secret:openshift-infra/hawkular-metrics-certificate/hawkular-metrics-ca.certificate

However, there is no hawkular-metrics-certificate secret in Metrics 3.6.0, so the agent cannot collect application-level metrics.
# oc get secret hawkular-metrics-certificate -n openshift-infra
Error from server (NotFound): secrets "hawkular-metrics-certificate" not found
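
The same check can be driven from the ca_cert_file value itself. The following is a minimal sketch, assuming the reference format is secret:<namespace>/<secret-name>/<key>; that format is inferred from the value above and from the agent log, not taken from the agent's documentation. The function names parse_secret_ref and check_secret_ref are hypothetical helpers.

```shell
#!/bin/sh
# Split a reference of the assumed form
#   secret:<namespace>/<secret-name>/<key>
# and print "<namespace> <secret-name> <key>". Returns 1 on malformed input.
parse_secret_ref() {
    ref=${1#secret:}
    # The value must carry the "secret:" prefix.
    [ "$ref" != "$1" ] || return 1
    ns=${ref%%/*}
    rest=${ref#*/}
    # No "/" after the namespace means the reference is incomplete.
    [ "$rest" != "$ref" ] || return 1
    name=${rest%%/*}
    key=${rest#*/}
    # No "/" after the secret name means the key is missing.
    [ "$key" != "$rest" ] || return 1
    [ -n "$ns" ] && [ -n "$name" ] && [ -n "$key" ] || return 1
    printf '%s %s %s\n' "$ns" "$name" "$key"
}

# Ask the cluster whether the referenced secret exists.
# Requires a logged-in oc client; only the exit status is used.
check_secret_ref() {
    parts=$(parse_secret_ref "$1") || {
        echo "not a secret reference: $1" >&2
        return 2
    }
    # Word-split "<namespace> <secret-name> <key>" into $1 $2 $3.
    set -- $parts
    oc get secret "$2" -n "$1" >/dev/null
}
```

With the value from the config map, check_secret_ref 'secret:openshift-infra/hawkular-metrics-certificate/hawkular-metrics-ca.certificate' would be expected to return a non-zero status on the affected 3.6.0 cluster, matching the NotFound error above.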

hawkular-openshift-agent is in CrashLoopBackOff status.
# oc get po -n default
NAME                             READY     STATUS             RESTARTS   AGE
docker-registry-3-dw9xz          1/1       Running            0          6h
hawkular-openshift-agent-rl3f5   0/1       CrashLoopBackOff   2          15m
hawkular-openshift-agent-tlh69   0/1       CrashLoopBackOff   2          15m
registry-console-1-pn85f         1/1       Running            0          6h
router-1-s0wh1                   1/1       Running            0          6h

# oc logs hawkular-openshift-agent-rl3f5 -n default
I0626 07:08:43.834605       1 hawkular-openshift-agent.go:69] Hawkular OpenShift Agent: Version: 1.2.2.Final, Commit: de35fdddc8a9f90b8094f372c6cf9576908dd780
I0626 07:08:43.834930       1 emitter_server.go:48] Agent emitter will emit metrics
I0626 07:08:43.834936       1 emitter_server.go:64] Agent emitter will NOT emit status
I0626 07:08:43.834940       1 emitter_server.go:70] Agent emitter will provide a health probe
I0626 07:08:43.834943       1 emitter_server.go:95] Agent will start the emitter endpoint at [:8080]
E0626 07:08:43.865156       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:08:43.865175       1 metrics_storage.go:216] Could not get the requested secret. This may mean the secret has not yet been created yet or the agent does not have the required permission to access the secret. Will attempt again every 5 seconds for the next 5 minutes
E0626 07:08:48.870860       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:08:53.875770       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:08:58.880059       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:09:03.884410       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:09:08.888818       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:09:13.893427       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:09:18.898125       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
E0626 07:09:23.902304       1 metrics_storage.go:200] Error trying to get Secret named [hawkular-metrics-certificate] from namespace [openshift-infra]: err=secrets "hawkular-metrics-certificate" not found
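
Because the agent retries every 5 seconds, the log fills with the same error. As a rough sketch, the repeated lines can be reduced to the distinct missing secrets with a small filter. The function name missing_secrets is hypothetical, and the pattern assumes the exact message wording shown in the log above.

```shell
#!/bin/sh
# Read agent log lines on stdin and print each missing secret once,
# as "<namespace>/<secret-name>". Lines that do not match are ignored.
missing_secrets() {
    sed -n 's|.*Error trying to get Secret named \[\([^]]*\)] from namespace \[\([^]]*\)].*|\2/\1|p' |
        sort -u
}
```

For example, oc logs hawkular-openshift-agent-rl3f5 -n default | missing_secrets would print openshift-infra/hawkular-metrics-certificate for the log above.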


Version-Release number of selected component (if applicable):
# openshift version
openshift v3.6.122
kubernetes v1.6.1+5115d708d7
etcd 3.2.0

Images from brew registry
metrics-hawkular-openshift-agent   v3.6                18d769922c5a        2 days ago          274 MB


How reproducible:
Always

Steps to Reproduce:
1. Deploy Metrics using ansible
2. Deploy the hawkular-openshift-agent
   1) copy files from:  https://github.com/openshift/origin-metrics/tree/enterprise/hawkular-openshift-agent
   2) Change the IMAGE parameter according to your environment in hawkular-openshift-agent.yaml and execute the following commands:
      oc process -f hawkular-openshift-agent.yaml | oc create -n default  -f -
      oc adm policy add-cluster-role-to-user hawkular-openshift-agent system:serviceaccount:default:hawkular-openshift-agent
3. Check the hawkular-openshift-agent pods' logs.

Actual results:
Could not get the requested secret named [hawkular-metrics-certificate] from namespace [openshift-infra]

Expected results:
There should be no errors in the pod log.

Additional info:

Comment 1 Junqi Zhao 2017-06-26 07:22:15 UTC
Created attachment 1291879 [details]
secrets under openshift-infra namespace

Comment 2 Matt Wringe 2017-09-29 18:13:16 UTC
For 3.6, HOSA is installed using ansible. See https://github.com/openshift/openshift-docs/pull/5419#issuecomment-333195877 for instructions.

Comment 3 Junqi Zhao 2017-09-30 00:52:01 UTC
We are now using ansible to install HOSA, and this issue no longer occurs.

Set it to VERIFIED.

Comment 5 Matt Wringe 2017-11-06 16:42:57 UTC
I think this is a doc issue. The docs for 3.6 are the old docs that we have for 3.5; they should be updated to include what we have in the master docs:
https://docs.openshift.org/latest/install_config/cluster_metrics.html

To install HOSA in 3.6, add the parameter "openshift_metrics_install_hawkular_agent" to your inventory file and set it to true (it defaults to false).
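
As an illustration only, the inventory change might look like the fragment below. It assumes the usual openshift-ansible layout with an [OSEv3:vars] group and that metrics are enabled through openshift_metrics_install_metrics; only openshift_metrics_install_hawkular_agent comes from this comment, so check the rest against your own inventory.

```ini
[OSEv3:vars]
# Assumption: metrics are deployed by the same inventory.
openshift_metrics_install_metrics=true
# Deploy the Hawkular OpenShift Agent (HOSA); defaults to false.
openshift_metrics_install_hawkular_agent=true
```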

Comment 6 Michael Burke 2017-11-16 21:14:08 UTC
FYI: This change was made to the 3.6+ docs on Sept. 29 through:
https://github.com/openshift/openshift-docs/pull/5419

Comment 7 Junqi Zhao 2017-11-17 07:02:19 UTC
(In reply to Michael Burke from comment #6)
> FYI: This change was made to the 3.6+ docs on sept. 29 through:
> https://github.com/openshift/openshift-docs/pull/5419

Got it, thanks

Comment 10 errata-xmlrpc 2017-11-28 21:58:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188