Bug 1978869

Summary: [GSS]Prometheus alertmanager reports msg="Error on notify" err="Post https://XXXX:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not XXXX"
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Geo Jose <gjose>
Component: Ceph-Ansible Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA QA Contact: Sunil Angadi <sangadi>
Severity: medium Docs Contact: Aron Gunn <agunn>
Priority: unspecified    
Version: 4.2 CC: agunn, aschoen, ceph-eng-bugs, dsavinea, epuertat, gabrioux, gmeno, mmuench, nthomas, rmandyam, sangadi, tserlin, vereddy, ykaul
Target Milestone: ---   
Target Release: 4.2z3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ceph-ansible-4.0.62.2-1.el8cp, ceph-ansible-4.0.62.2-1.el7cp Doc Type: Bug Fix
Doc Text:
.The role `ceph-dashboard` in `ceph-ansible` enforces the common name of the self-signed certificate to `ceph-dashboard`
Previously, when using the self-signed certificates generated by `ceph-ansible`, the common name (CN) was forced to `ceph-dashboard`. As a result, applications such as Prometheus failed with an error, because the CN did not match the hostname of the node presenting the certificate to clients. With this release, `ceph-ansible` sets the CN to the proper value and Prometheus works as expected.
Story Points: ---
Clone Of:
: 2020628 Environment:
Last Closed: 2021-09-27 18:26:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1890121    

Description Geo Jose 2021-07-03 05:53:32 UTC
Description of problem:
 - The alertmanager service continuously logs a large number of errors.

Version-Release number of selected component (if applicable):
 - RHCS 4.2

Steps to Reproduce:
1. Install RHCS 4.2
2. Check alertmanager logs.

Actual results:
 - The alertmanager service logs a large number of errors.

Expected results:
 - If this is an error, it needs to be fixed.
 - If this is not an error, these messages should not be logged.

Comment 2 Geo Jose 2021-07-03 06:04:44 UTC
 - I reproduced the issue in my test lab setup and get the following errors:
--- 
# docker logs -f alertmanager

[...]

level=error ts=2021-07-03T04:04:41.324701372Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="Post https://node2.sample.com:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not node2.sample.com"
level=error ts=2021-07-03T04:04:41.324866356Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="Post https://node3.sample.com:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not node3.sample.com"
level=error ts=2021-07-03T04:04:41.324692299Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="Post https://node1.sample.com:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not node1.sample.com"
level=error ts=2021-07-03T04:04:41.325215083Z caller=dispatch.go:177 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="Post https://node1.sample.com:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not node1.sample.com; Post https://node2.sample.com:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not node2.sample.com; Post https://node3.sample.com:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not node3.sample.com"
--- 
# docker exec -it alertmanager bash
# alertmanager --version
alertmanager, version 0.16.2 (branch: rhaos-4.1-rhel-7, revision: ee10299512a6d58618947e9f4dcfb9d57b59db81)
  build user:       root@836327b9dbaa
  build date:       20200302-19:07:23
  go version:       go1.11.13
---
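
The x509 errors above all follow one pattern: the receiver URL uses the node's hostname, while the certificate names only `ceph-dashboard`. To see which receivers are affected on a larger cluster, the log can be summarized with a small script. This is a minimal sketch in Python, assuming the log format matches the lines quoted above; the function name `summarize_cert_mismatches` and the regular expression are illustrative only and are not part of alertmanager or ceph-ansible.

```python
import re
from collections import Counter

# Matches the x509 hostname-mismatch message seen in the alertmanager log.
# The pattern is an assumption derived only from the log lines quoted above.
MISMATCH_RE = re.compile(
    r"Post (?P<url>https://(?P<host>[^:/\s]+)\S*): "
    r"x509: certificate is valid for (?P<cert_name>[^,]+), "
    r"not (?P<expected>[^\s;\"]+)"
)


def summarize_cert_mismatches(lines):
    """Count (receiver host, certificate name) pairs in 'Error on notify' lines."""
    counts = Counter()
    for line in lines:
        # Skip the aggregated "Notify for alerts failed" line, which only
        # repeats the per-receiver errors already counted.
        if 'msg="Error on notify"' not in line:
            continue
        match = MISMATCH_RE.search(line)
        if match:
            counts[(match.group("host"), match.group("cert_name"))] += 1
    return counts
```

For example, after saving the output of `docker logs alertmanager` to a file (the file name `alertmanager.log` is hypothetical), `summarize_cert_mismatches(open("alertmanager.log"))` returns a counter keyed by receiver host and the name the certificate is actually valid for.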

Comment 27 errata-xmlrpc 2021-09-27 18:26:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 4.2 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3670