.Grafana certificate does not migrate during upgrade
When you upgrade from Red Hat Ceph Storage 8.1 to 9.0, the existing user-signed Grafana certificate is not migrated. Instead, Grafana switches to a cephadm-signed certificate. As a result, duplicate certificate entries may appear, and certificate-related health warnings can persist. Manual reconfiguration is required if you want to use custom TLS certificates.
NOTE: Data services remain unaffected.
To work without custom TLS certificates, you can continue using the cephadm-signed certificate.
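To see which certificate source Grafana ended up with after the upgrade, you can inspect the exported service specification. The following is a minimal sketch, not an official procedure. The `cert_source` helper name is hypothetical, and it assumes that the export prints `certificate_source` as a plain `key: value` line, as in the output captured for this issue.

```shell
# Hypothetical helper: read an exported Grafana spec on stdin and print the
# value of certificate_source, or "unset" when the field is absent.
cert_source() {
  awk '$1 == "certificate_source:" { print $2; found = 1 }
       END { if (!found) print "unset" }'
}

# Example usage on an admin node (requires a running cluster):
#   ceph orch ls --service-type grafana --export | cert_source
```

A value of `cephadm-signed` after the upgrade indicates that the custom certificate is not in use.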
As a workaround to use custom TLS certificates, complete the following steps:
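1. Change the Grafana service specification to use `certificate_source: reference`, and apply it with the `ceph orch apply -i` command.
2. Use the `ceph orch certmgr` commands to upload a valid user-signed certificate and key for each host that runs a Grafana daemon.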
3. Run the `ceph orch reconfig grafana` command.
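The workaround steps can be sketched as a small script. This is an illustration under stated assumptions rather than a supported tool: the file names `grafana-reference.yaml` and `grafana_cert_key.pem` are hypothetical, the specification mirrors the one captured in the bug report for this issue, and the `cert-key set` form is the command that the report ran. The report passed `--force` because its test certificate had expired; the flag is omitted here on the assumption that a valid certificate does not need it. The script only prints the commands unless `APPLY=1` is set.

```shell
#!/bin/sh
# Sketch of the custom-certificate workaround. Dry-run by default: commands
# are printed, not executed. Set APPLY=1 to run them on an admin node.
set -eu

SPEC_FILE="${SPEC_FILE:-grafana-reference.yaml}"       # hypothetical file name
CERT_KEY_PEM="${CERT_KEY_PEM:-grafana_cert_key.pem}"   # hypothetical combined cert+key PEM

run() {
  # Print the command; execute it only when APPLY=1.
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

write_spec() {
  # Step 1: Grafana spec with certificate_source set to reference.
  cat > "$1" <<'EOF'
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  certificate_source: reference
  protocol: https
  ssl: true
EOF
}

apply_workaround() {
  # Arguments: one or more hostnames that run a Grafana daemon.
  write_spec "$SPEC_FILE"
  run ceph orch apply -i "$SPEC_FILE"
  for host in "$@"; do
    # Step 2: upload the user-signed certificate and key for each host.
    run ceph orch certmgr cert-key set grafana --hostname "$host" -i "$CERT_KEY_PEM"
  done
  # Step 3: reconfigure Grafana so that it picks up the certificate.
  run ceph orch reconfig grafana
}
```

Calling `apply_workaround <host>` prints the planned commands; review them before rerunning with `APPLY=1`.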
Description of problem:
------------------------
When we do upgrade from RHCS 8.1 with user-signed cert for Grafana to RHCS 9.0 (20.1.0-81) RHEL9, ceph UBI9; user-signed cert for Grafana is not migrated/lost and new cephadm-signed cert is generated.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHCS 9.0 (20.1.0-81) RHEL9, ceph UBI9

How reproducible:
-------------------------
Always

Steps to Reproduce:
-------------------------
[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph version
ceph version 19.2.1-292.el9cp (ba02d589f9356be88303b8e8ec2790f12300f3b5) squid (stable)
[root@ceph-sraut-certnew-dd1445-node1-installer ~]#
[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch ls
NAME                       PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager               ?:9093,9094      1/1  5m ago     14m  count:1
ceph-exporter                               3/3  5m ago     14m  *
grafana                    ?:3000           1/1  5m ago     14m  count:1
mgr                                         2/2  5m ago     12m  label:mgr
mon                                         3/3  5m ago     10m  label:mon
node-exporter              ?:9100           3/3  5m ago     14m  *
osd.all-available-devices                     9  5m ago     6m   *
prometheus                 ?:9095           1/1  5m ago     14m  count:1
rgw.rgw.new                ?:80             3/3  5m ago     5m   ceph-sraut-certnew-dd1445-node1-installer;ceph-sraut-certnew-dd1445-node2;ceph-sraut-certnew-dd1445-node3

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert ls
grafana_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1094
cephadm_root_ca_cert
  scope: global
  certificates
    subject
      commonName: cephadm-root-8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    validity
      remaining_days: 3652

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr key ls
grafana_key
  scope: host
  keys
    ceph-sraut-certnew-dd1445-node1-installer
      key_type: RSA
      key_size: 4096

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch ls --service-type grafana --export
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  protocol: https

>> Grafana - add user signed cert

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert-key set grafana --hostname ceph-sraut-certnew-dd1445-node1-installer -i cert_key_expired.pem --force
Certificate/key pair set correctly

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch ls --service-type grafana --export
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  protocol: https

>> Before Upgrade -

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert ls
grafana_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        countryName: IN
        stateOrProvinceName: NH
        localityName: BLR
        organizationName: IBM
        organizationalUnitName: IBM
        commonName: abc
        1.2.840.113549.1.9.1: abc
      validity
        remaining_days: 0
cephadm_root_ca_cert
  scope: global
  certificates
    subject
      commonName: cephadm-root-8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    validity
      remaining_days: 3652

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr key ls
grafana_key
  scope: host
  keys
    ceph-sraut-certnew-dd1445-node1-installer
      key_type: RSA
      key_size: 2048

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph -s
  cluster:
    id:     8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-sraut-certnew-dd1445-node1-installer,ceph-sraut-certnew-dd1445-node3,ceph-sraut-certnew-dd1445-node2 (age 14m)
    mgr: ceph-sraut-certnew-dd1445-node1-installer.igflkj(active, since 17m), standbys: ceph-sraut-certnew-dd1445-node2.psvcmq
    osd: 9 osds: 9 up (since 9m), 9 in (since 10m)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    pools:   5 pools, 129 pgs
    objects: 228 objects, 459 KiB
    usage:   440 MiB used, 90 GiB / 90 GiB avail
    pgs:     129 active+clean

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph config set mgr mgr/cephadm/certificate_check_debug_mode true
[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph -s
  cluster:
    id:     8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    health: HEALTH_WARN
            Detected 1 cephadm certificate(s) issues: 1 expiring

  services:
    mon: 3 daemons, quorum ceph-sraut-certnew-dd1445-node1-installer,ceph-sraut-certnew-dd1445-node3,ceph-sraut-certnew-dd1445-node2 (age 14m)
    mgr: ceph-sraut-certnew-dd1445-node1-installer.igflkj(active, since 17m), standbys: ceph-sraut-certnew-dd1445-node2.psvcmq
    osd: 9 osds: 9 up (since 9m), 9 in (since 10m)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    pools:   5 pools, 129 pgs
    objects: 228 objects, 459 KiB
    usage:   440 MiB used, 90 GiB / 90 GiB avail
    pgs:     129 active+clean

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert check
- Certificate 'grafana_cert (ceph-sraut-certnew-dd1445-node1-installer)' (user-made) is about to expire (remaining days: 0)

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph health detail
HEALTH_WARN Detected 1 cephadm certificate(s) issues: 1 expiring
[WRN] CEPHADM_CERT_ERROR: Detected 1 cephadm certificate(s) issues: 1 expiring
    Certificate 'grafana_cert (ceph-sraut-certnew-dd1445-node1-installer)' (user-made) is about to expire (remaining days: 0)

>> Proceed to upgrade -

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph osd set noout
noout is set
[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph osd set noscrub
noscrub is set
[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph osd set nodeep-scrub
nodeep-scrub is set

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch upgrade check quay.io/rhceph-ci/rhceph:on-pr-v9.0-d194ae96a10191662789de56a129c87f4f77ebbc
    "target_digest": "quay.io/rhceph-ci/rhceph@sha256:2cc5939e635ed5d1f81aac136ce800435c99dc974fae8224c367981f817bb9b7",
    "target_id": "cd23043530ac27f3531156db62b42d7343e23fa3f5ecf08990ae64237cc5a2d5",
    "target_name": "quay.io/rhceph-ci/rhceph:on-pr-v9.0-d194ae96a10191662789de56a129c87f4f77ebbc",
    "target_version": "ceph version 20.1.0-81.el9cp (92ba451cb028aab09081ff9ade0cdb2b1dbb836d) tentacle (rc)",
    "up_to_date": []
}

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch upgrade start quay.io/rhceph-ci/rhceph:on-pr-v9.0-d194ae96a10191662789de56a129c87f4f77ebbc
Initiating upgrade to quay.io/rhceph-ci/rhceph:on-pr-v9.0-d194ae96a10191662789de56a129c87f4f77ebbc

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# date; ceph orch upgrade status
Thu Nov 13 16:14:53 UTC 2025
{
    "in_progress": true,
    "target_image": "quay.io/rhceph-ci/rhceph@sha256:2cc5939e635ed5d1f81aac136ce800435c99dc974fae8224c367981f817bb9b7",
    "services_complete": [],
    "which": "Upgrading all daemon types on all hosts",
    "progress": "0/26 daemons upgraded",
    "message": "Pulling quay.io/rhceph-ci/rhceph@sha256:2cc5939e635ed5d1f81aac136ce800435c99dc974fae8224c367981f817bb9b7 image on host ceph-sraut-certnew-dd1445-node2",
    "is_paused": false
}

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# date; ceph orch upgrade status
Thu Nov 13 16:25:39 UTC 2025
There are no upgrades in progress currently.

>> After upgrade -

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph version
ceph version 20.1.0-81.el9cp (92ba451cb028aab09081ff9ade0cdb2b1dbb836d) tentacle (rc - RelWithDebInfo)

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph -s
  cluster:
    id:     8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    health: HEALTH_WARN
            1 failed cephadm daemon(s)
            noout,noscrub,nodeep-scrub flag(s) set

  services:
    mon: 3 daemons, quorum ceph-sraut-certnew-dd1445-node1-installer,ceph-sraut-certnew-dd1445-node3,ceph-sraut-certnew-dd1445-node2 (age 8m) [leader: ceph-sraut-certnew-dd1445-node1-installer]
    mgr: ceph-sraut-certnew-dd1445-node1-installer.igflkj(active, since 9m), standbys: ceph-sraut-certnew-dd1445-node2.psvcmq
    osd: 9 osds: 9 up (since 6m), 9 in (since 25m)
         flags noout,noscrub,nodeep-scrub
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    pools:   5 pools, 129 pgs
    objects: 235 objects, 459 KiB
    usage:   311 MiB used, 90 GiB / 90 GiB avail
    pgs:     129 active+clean

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s); noout,noscrub,nodeep-scrub flag(s) set
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon grafana.ceph-sraut-certnew-dd1445-node1-installer on ceph-sraut-certnew-dd1445-node1-installer is in error state
[WRN] OSDMAP_FLAGS: noout,noscrub,nodeep-scrub flag(s) set

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert ls --include-cephadm-signed
cephadm_root_ca_cert
  scope: global
  certificates
    subject
      commonName: cephadm-root-8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    validity
      remaining_days: 3652
cephadm-signed_agent_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node2
      subject
        commonName: 10.0.66.59
      validity
        remaining_days: 1824
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1824
cephadm-signed_grafana_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1094

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr key ls --include-cephadm-generated-keys
cephadm-signed_agent_key
  scope: host
  keys
    ceph-sraut-certnew-dd1445-node2
      key_type: RSA
      key_size: 4096
    ceph-sraut-certnew-dd1445-node1-installer
      key_type: RSA
      key_size: 4096
cephadm-signed_grafana_key
  scope: host
  keys
    ceph-sraut-certnew-dd1445-node1-installer
      key_type: RSA
      key_size: 4096

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert check
- All certificates are valid. No issues detected.

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch ls --service-type grafana --export
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  certificate_source: cephadm-signed
  protocol: https
  ssl: true

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert-key set grafana --hostname ceph-sraut-certnew-dd1445-node1-installer -i cert_key_expired.pem --force
Certificate/key pair set correctly

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch reconfig grafana
Scheduled to reconfig grafana.ceph-sraut-certnew-dd1445-node1-installer on host 'ceph-sraut-certnew-dd1445-node1-installer'

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch ls --service-type grafana --export
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  certificate_source: cephadm-signed
  protocol: https
  ssl: true

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert ls --include-cephadm-signed
grafana_ssl_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        countryName: IN
        stateOrProvinceName: NH
        localityName: BLR
        organizationName: IBM
        organizationalUnitName: IBM
        commonName: abc
        1.2.840.113549.1.9.1: abc
      validity
        remaining_days: 0
cephadm_root_ca_cert
  scope: global
  certificates
    subject
      commonName: cephadm-root-8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    validity
      remaining_days: 3652
cephadm-signed_agent_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node2
      subject
        commonName: 10.0.66.59
      validity
        remaining_days: 1824
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1824
cephadm-signed_grafana_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1094

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch ls --service-type grafana --export
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  certificate_source: cephadm-signed
  protocol: https
  ssl: true

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# vi new_grafana.yaml
[root@ceph-sraut-certnew-dd1445-node1-installer ~]# cat new_grafana.yaml
service_type: grafana
service_name: grafana
placement:
  count: 1
spec:
  anonymous_access: true
  certificate_source: reference
  protocol: https
  ssl: true

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch apply -i new_grafana.yaml
Scheduled grafana update...
Warning: SSL is configured with 'reference', and this service uses per-host certificates.
To configure keys/certificates, run the following commands for each host daemons are deployed on:
 > ceph orch certmgr cert set --cert-name grafana_ssl_cert --service-name grafana --hostname <host> -i <cert-file>
 > ceph orch certmgr key set --key-name grafana_ssl_cert --service-name grafana --hostname <host> -i <key-file>
Once all certificates are provisioned, run:
 > ceph orch reconfig grafana
to reconfigure the service with the certificates.

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert ls --include-cephadm-signed --filter-by="grafana"
grafana_ssl_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        countryName: IN
        stateOrProvinceName: NH
        localityName: BLR
        organizationName: IBM
        organizationalUnitName: IBM
        commonName: abc
        1.2.840.113549.1.9.1: abc
      validity
        remaining_days: 0
cephadm_root_ca_cert
  scope: global
  certificates
    subject
      commonName: cephadm-root-8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    validity
      remaining_days: 3652
cephadm-signed_agent_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node2
      subject
        commonName: 10.0.66.59
      validity
        remaining_days: 1824
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1824
cephadm-signed_grafana_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1094

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch reconfig grafana
Scheduled to reconfig grafana.ceph-sraut-certnew-dd1445-node1-installer on host 'ceph-sraut-certnew-dd1445-node1-installer'

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph orch certmgr cert ls --include-cephadm-signed
grafana_ssl_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        countryName: IN
        stateOrProvinceName: NH
        localityName: BLR
        organizationName: IBM
        organizationalUnitName: IBM
        commonName: abc
        1.2.840.113549.1.9.1: abc
      validity
        remaining_days: 0
cephadm_root_ca_cert
  scope: global
  certificates
    subject
      commonName: cephadm-root-8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    validity
      remaining_days: 3652
cephadm-signed_agent_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node2
      subject
        commonName: 10.0.66.59
      validity
        remaining_days: 1824
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1824
cephadm-signed_grafana_cert
  scope: host
  certificates
    ceph-sraut-certnew-dd1445-node1-installer
      subject
        commonName: 10.0.67.118
      validity
        remaining_days: 1094

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph -s
  cluster:
    id:     8feec3d0-c0a8-11f0-915c-fa163ea3d94d
    health: HEALTH_WARN
            Detected 1 cephadm certificate(s) issues: 1 expiring
            1 failed cephadm daemon(s)
            noout,noscrub,nodeep-scrub flag(s) set

  services:
    mon: 3 daemons, quorum ceph-sraut-certnew-dd1445-node1-installer,ceph-sraut-certnew-dd1445-node3,ceph-sraut-certnew-dd1445-node2 (age 34m) [leader: ceph-sraut-certnew-dd1445-node1-installer]
    mgr: ceph-sraut-certnew-dd1445-node1-installer.igflkj(active, since 34m), standbys: ceph-sraut-certnew-dd1445-node2.psvcmq
    osd: 9 osds: 9 up (since 32m), 9 in (since 51m)
         flags noout,noscrub,nodeep-scrub
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    pools:   5 pools, 129 pgs
    objects: 235 objects, 459 KiB
    usage:   328 MiB used, 90 GiB / 90 GiB avail
    pgs:     129 active+clean

[root@ceph-sraut-certnew-dd1445-node1-installer ~]# ceph health detail
HEALTH_WARN Detected 1 cephadm certificate(s) issues: 1 expiring; 1 failed cephadm daemon(s); noout,noscrub,nodeep-scrub flag(s) set
[WRN] CEPHADM_CERT_ERROR: Detected 1 cephadm certificate(s) issues: 1 expiring
    Certificate 'grafana_ssl_cert (ceph-sraut-certnew-dd1445-node1-installer)' (user-made) is about to expire (remaining days: 0)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon grafana.ceph-sraut-certnew-dd1445-node1-installer on ceph-sraut-certnew-dd1445-node1-installer is in unknown state
[WRN] OSDMAP_FLAGS: noout,noscrub,nodeep-scrub flag(s) set

Actual results:
-------------------------
Grafana user-signed cert is not migrated. Also, after we add a new cert-key and reconfig Grafana as well as add new spec with "certificate_source: reference", we keep seeing 2 certs for Grafana, user-signed as well as cephadm-signed.

Expected results:
-------------------------
Grafana user signed-cert migration should be successful during upgrade. And if post upgrade, a new cert-key is added and spec file is updated, the cert list should show only 1 cert, i.e. user-signed.

Additional info:
-----------------
More details at https://docs.google.com/document/d/1v4Uo9YcWEznWBqkznN7ndDUn24SmgttQttYOKmuSVlo/edit?tab=t.33865iuzw341#heading=h.px6268qhtri8
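
The duplicate-entry symptom described in the actual results can be checked with a small filter. This is only a sketch under assumptions: the `grafana_cert_entries` helper is hypothetical, and it pattern-matches entry names such as `grafana_ssl_cert` and `cephadm-signed_grafana_cert` as they appear in the captured output above rather than parsing the full listing format.

```shell
# Hypothetical helper: print the distinct Grafana certificate entry names found
# in `ceph orch certmgr cert ls --include-cephadm-signed` output read from stdin.
grafana_cert_entries() {
  grep -oE '[A-Za-z0-9_-]*grafana[A-Za-z0-9_]*_cert' | sort -u
}

# Example usage on an admin node (requires a running cluster):
#   ceph orch certmgr cert ls --include-cephadm-signed | grafana_cert_entries
```

More than one name in the result suggests that both a user-signed and a cephadm-signed Grafana certificate are still listed.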