Bug 2099348 - alertmanager.yml is configured with wrong webhook_configs URLs
Summary: alertmanager.yml is configured with wrong webhook_configs URLs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 5.2
Assignee: Adam King
QA Contact: Sunil Angadi
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 1820257 2102272
Reported: 2022-06-20 15:50 UTC by Marian Krcmarik
Modified: 2022-08-09 17:39 UTC (History)

Fixed In Version: ceph-16.2.8-60.el8cp
Doc Type: Bug Fix
Doc Text:
.`Cephadm` uses the FQDN to build the alertmanager webhook URLs

Previously, `Cephadm` picked alertmanager webhook URLs based on the IP address it had stored for the hosts. This caused issues since these webhook URLs would not work for certain deployments. With this fix, `Cephadm` uses FQDNs to build the alertmanager webhook URLs, enabling webhook URLs to work for some deployment situations which were previously broken.
Clone Of:
Environment:
Last Closed: 2022-08-09 17:39:10 UTC
Embargoed:
sangadi: needinfo+
sangadi: needinfo-


Attachments
cephadm_command.log (264.90 KB, text/plain)
2022-06-20 15:50 UTC, Marian Krcmarik


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHCEPH-4584     0        None      None    None     2022-06-20 15:59:38 UTC
Red Hat Product Errata  RHSA-2022:5997  0        None      None    None     2022-08-09 17:39:56 UTC

Description Marian Krcmarik 2022-06-20 15:50:24 UTC
Created attachment 1891347: cephadm_command.log

Description of problem:
The problem was observed on an OpenStack 17 deployment that uses cephadm to configure Ceph. One of the options is to configure ceph-dashboard for the Ceph cluster. The alertmanager configuration generated by cephadm appears to contain wrong webhook_configs URLs. The following snippet shows what alertmanager.yml contains:
receivers:
- name: 'default'
  webhook_configs:
- name: 'ceph-dashboard'
  webhook_configs:
  - url: 'https://172.23.1.55:8444//api/prometheus_receiver'
  - url: 'https://192.168.24.19:8444/api/prometheus_receiver'
  - url: 'https://192.168.24.44:8444/api/prometheus_receiver'

There should be three endpoints, since the Ceph cluster has three mgr nodes, but the URLs are configured with IP addresses from a mixture of networks. The first one (172.23.1.55) is correct: it comes from the "public_network" range (the storage network in OpenStack terminology), and ceph-mgr listens on that address. The other two addresses, 192.168.24.19 and 192.168.24.44, belong to the other two nodes but come from a different network (ctlplane in OpenStack terms), and nothing listens there on port 8444.
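
The mismatch can be checked mechanically. The following is a minimal sketch, not part of cephadm: it takes the global public_network value from the `ceph config dump` output in this report and the three URLs from alertmanager.yml, and lists the URLs whose host address falls outside every public_network range. The function name `urls_outside_networks` is made up for illustration.

```python
import ipaddress
from urllib.parse import urlsplit

# Values copied from this report: the global public_network setting and the
# webhook URLs found in the generated alertmanager.yml.
PUBLIC_NETWORKS = "172.23.3.0/24,172.23.1.0/24,172.23.2.0/24"
WEBHOOK_URLS = [
    "https://172.23.1.55:8444//api/prometheus_receiver",
    "https://192.168.24.19:8444/api/prometheus_receiver",
    "https://192.168.24.44:8444/api/prometheus_receiver",
]


def urls_outside_networks(urls, networks_csv):
    """Return the URLs whose host is an IP address outside every listed network."""
    networks = [ipaddress.ip_network(n.strip()) for n in networks_csv.split(",")]
    outside = []
    for url in urls:
        host = urlsplit(url).hostname
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            # Not an IP literal (for example an FQDN); this check cannot judge it.
            continue
        if not any(addr in net for net in networks):
            outside.append(url)
    return outside


if __name__ == "__main__":
    for url in urls_outside_networks(WEBHOOK_URLS, PUBLIC_NETWORKS):
        print("not on public_network:", url)
```

With the values above, the check flags the two 192.168.24.x URLs and accepts the 172.23.1.55 one. It only looks at network membership; it does not verify that anything is actually listening on port 8444.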

The config imo should look like:
- name: 'ceph-dashboard'
  webhook_configs:
  - url: 'https://172.23.1.55:8444//api/prometheus_receiver'
  - url: 'https://172.23.1.124:8444/api/prometheus_receiver'
  - url: 'https://172.23.1.243:8444/api/prometheus_receiver'

The deployment is TLS-based, so ideally FQDNs should be used rather than IPs, but since hostname SSL verification is disabled (https://github.com/ceph/ceph/pull/45860), it does not matter that much.

[root@central-controller-1 /]# ceph config dump                       
WHO                                             MASK  LEVEL     OPTION                                                 VALUE                                                                                                                                RO
global                                                advanced  cluster_network                                        172.18.1.0/24,172.18.3.0/24,172.18.2.0/24                                                                                            * 
global                                                basic     container_image                                        site-undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:90e4316d65f4a76fea307705d9b0e4706f05e10a63bf041dbee379c8711db115  * 
global                                                advanced  ms_bind_ipv4                                           true                                                                                                                                   
global                                                advanced  ms_bind_ipv6                                           false                                                                                                                                  
global                                                advanced  osd_pool_default_pg_num                                32                                                                                                                                     
global                                                advanced  osd_pool_default_pgp_num                               32                                                                                                                                     
global                                                advanced  osd_pool_default_size                                  3                                                                                                                                      
global                                                advanced  public_network                                         172.23.3.0/24,172.23.1.0/24,172.23.2.0/24                                                                                            * 
global                                                advanced  rgw_keystone_accepted_admin_roles                      ResellerAdmin, swiftoperator                                                                                                         * 
global                                                advanced  rgw_keystone_accepted_roles                            member, Member, admin                                                                                                                * 
global                                                advanced  rgw_keystone_admin_domain                              default                                                                                                                              * 
global                                                advanced  rgw_keystone_admin_password                            wXarbW5czGbiPhQwQkRpQ03ZH                                                                                                            * 
global                                                advanced  rgw_keystone_admin_project                             service                                                                                                                              * 
global                                                advanced  rgw_keystone_admin_user                                swift                                                                                                                                * 
global                                                advanced  rgw_keystone_api_version                               3                                                                                                                                      
global                                                advanced  rgw_keystone_implicit_tenants                          true                                                                                                                                 * 
global                                                basic     rgw_keystone_url                                       https://overcloud.internalapi.redhat.local:5000                                                                                      * 
global                                                advanced  rgw_max_attr_name_len                                  128                                                                                                                                    
global                                                advanced  rgw_max_attr_size                                      256                                                                                                                                    
global                                                advanced  rgw_max_attrs_num_in_req                               90                                                                                                                                     
global                                                advanced  rgw_s3_auth_use_keystone                               true                                                                                                                                   
global                                                advanced  rgw_swift_account_in_url                               true                                                                                                                                   
global                                                advanced  rgw_swift_enforce_content_length                       true                                                                                                                                   
global                                                advanced  rgw_swift_versioning_enabled                           true                                                                                                                                   
global                                                advanced  rgw_trust_forwarded_https                              true                                                                                                                                   
  mon                                                 advanced  auth_allow_insecure_global_id_reclaim                  false                                                                                                                                  
  mon                                                 advanced  public_network                                         172.23.1.0/24                                                                                                                        * 
  mgr                                                 advanced  mgr/cephadm/autotune_memory_target_ratio               0.200000                                                                                                                             * 
  mgr                                                 advanced  mgr/cephadm/container_image_alertmanager               site-undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus-alertmanager:v4.6                                      * 
  mgr                                                 advanced  mgr/cephadm/container_image_base                       site-undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph                                                                            
  mgr                                                 advanced  mgr/cephadm/container_image_grafana                    site-undercloud-0.ctlplane.redhat.local:8787/rhceph/rhceph-3-dashboard-rhel7:3                                                       * 
  mgr                                                 advanced  mgr/cephadm/container_image_node_exporter              site-undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus-node-exporter:v4.6                                     * 
  mgr                                                 advanced  mgr/cephadm/container_image_prometheus                 site-undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus:v4.6                                                   * 
  mgr                                                 advanced  mgr/cephadm/container_init                             True                                                                                                                                 * 
  mgr                                                 advanced  mgr/cephadm/migration_current                          5                                                                                                                                    * 
  mgr                                                 advanced  mgr/cephadm/yes_i_know                                 true                                                                                                                                 * 
  mgr                                                 advanced  mgr/dashboard/ALERTMANAGER_API_HOST                    http://192.168.24.44:9093                                                                                                            * 
  mgr                                                 advanced  mgr/dashboard/GRAFANA_API_PASSWORD                     7Z4CoitHPngmXUIs8zPdJFDjr                                                                                                            * 
  mgr                                                 advanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY                   false                                                                                                                                * 
  mgr                                                 advanced  mgr/dashboard/GRAFANA_API_URL                          https://192.168.24.71:3100                                                                                                           * 
  mgr                                                 advanced  mgr/dashboard/GRAFANA_API_USERNAME                     admin                                                                                                                                * 
  mgr                                                 advanced  mgr/dashboard/PROMETHEUS_API_HOST                      http://192.168.24.44:9092                                                                                                            * 
  mgr                                                 advanced  mgr/dashboard/central-controller-0.ymchoy/server_addr  172.23.1.55                                                                                                                          * 
  mgr                                                 advanced  mgr/dashboard/central-controller-1.folekp/server_addr  172.23.1.124                                                                                                                         * 
  mgr                                                 advanced  mgr/dashboard/central-controller-2.rmvfub/server_addr  172.23.1.243                                                                                                                         * 
  mgr                                                 advanced  mgr/dashboard/server_port                              8444                                                                                                                                 * 
  mgr                                                 advanced  mgr/dashboard/ssl                                      true                                                                                                                                 * 
  mgr                                                 advanced  mgr/dashboard/ssl_server_port                          8444                                                                                                                                 * 
  mgr                                                 advanced  mgr/orchestrator/orchestrator                          cephadm                                                                                                                                
  osd                                                 advanced  osd_memory_target_autotune                             true                                                                                                                                   
  osd                                                 advanced  osd_numa_auto_affinity                                 true                                                                                                                                 * 
    client.rgw.rgw                                    advanced  rgw_realm                                              default                                                                                                                              * 
    client.rgw.rgw                                    advanced  rgw_zone                                               default                                                                                                                              * 
    client.rgw.rgw.central-controller-0.yqfboh        basic     rgw_frontends                                          beast endpoint=172.23.1.55:8080                                                                                                      * 
    client.rgw.rgw.central-controller-1.mmwccp        basic     rgw_frontends                                          beast endpoint=172.23.1.124:8080                                                                                                     * 
    client.rgw.rgw.central-controller-2.ddemwo        basic     rgw_frontends                                          beast endpoint=172.23.1.243:8080


[root@central-controller-1 /]# ceph orch host ls
HOST                   ADDR           LABELS          STATUS  
central-computehci0-0  192.168.24.51  osd                     
central-computehci0-1  192.168.24.66  osd                     
central-computehci0-2  192.168.24.53  osd                     
central-controller-0   192.168.24.86  mon mgr _admin          
central-controller-1   192.168.24.19  mon mgr _admin          
central-controller-2   192.168.24.44  mon mgr _admin


[root@central-controller-1 /]# ceph orch ps
NAME                                 HOST                   PORTS                   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION           IMAGE ID      CONTAINER ID  
alertmanager.central-controller-0    central-controller-0   172.23.1.55:9093,9094   running (18m)    62s ago  10h    20.5M        -                    4dbd32a970e9  feb49b8e5b37  
alertmanager.central-controller-1    central-controller-1   172.23.1.124:9093,9094  running (20h)     9m ago  10h    29.0M        -                    4dbd32a970e9  6a1cacb72aaf  
alertmanager.central-controller-2    central-controller-2   172.23.1.243:9093,9094  running (20h)     9m ago  10h    27.6M        -                    4dbd32a970e9  d0d2aecb156b  
crash.central-computehci0-0          central-computehci0-0                          running (13h)     9m ago  13h    6387k        -  16.2.7-100.el8cp  9ea8ac4eae90  d3013b0535b6  
crash.central-computehci0-1          central-computehci0-1                          running (13h)     9m ago  13h    6387k        -  16.2.7-100.el8cp  9ea8ac4eae90  aead80071fad  
crash.central-computehci0-2          central-computehci0-2                          running (13h)     9m ago  13h    6387k        -  16.2.7-100.el8cp  9ea8ac4eae90  86db09b416b5  
crash.central-controller-0           central-controller-0                           running (18m)    62s ago  13h    6387k        -  16.2.7-100.el8cp  9ea8ac4eae90  92e7bf61a09f  
crash.central-controller-1           central-controller-1                           running (13h)     9m ago  13h    6395k        -  16.2.7-100.el8cp  9ea8ac4eae90  bda3fe7b326b  
crash.central-controller-2           central-controller-2                           running (13h)     9m ago  13h    6383k        -  16.2.7-100.el8cp  9ea8ac4eae90  0dbeb783db94  
grafana.central-controller-0         central-controller-0   172.23.1.55:3100        running (18m)    62s ago  10h    21.3M        -  5.2.4             e35e7f8b951b  0d488434c41b  
grafana.central-controller-1         central-controller-1   172.23.1.124:3100       running (20h)     9m ago  10h    19.2M        -  5.2.4             e35e7f8b951b  5d2b2628905f  
grafana.central-controller-2         central-controller-2   172.23.1.243:3100       running (20h)     9m ago  10h    18.9M        -  5.2.4             e35e7f8b951b  93a3197e9ffe  
mgr.central-controller-0.ymchoy      central-controller-0   *:9283                  running (18m)    62s ago  13h     395M        -  16.2.7-100.el8cp  9ea8ac4eae90  25e27f48dabc  
mgr.central-controller-1.folekp      central-controller-1                           running (13h)     9m ago  13h     400M        -  16.2.7-100.el8cp  9ea8ac4eae90  153032929d3d  
mgr.central-controller-2.rmvfub      central-controller-2                           running (13h)     9m ago  13h     462M        -  16.2.7-100.el8cp  9ea8ac4eae90  b2e0c4b79a99  
mon.central-controller-0             central-controller-0                           running (18m)    62s ago  13h     121M    2048M  16.2.7-100.el8cp  9ea8ac4eae90  acbde401bb6e  
mon.central-controller-1             central-controller-1                           running (13h)     9m ago  13h     974M    2048M  16.2.7-100.el8cp  9ea8ac4eae90  2c00ffa88971  
mon.central-controller-2             central-controller-2                           running (13h)     9m ago  13h     971M    2048M  16.2.7-100.el8cp  9ea8ac4eae90  9dbc00978ae7  
node-exporter.central-computehci0-0  central-computehci0-0  172.23.1.225:9100       running (10h)     9m ago  10h    24.2M        -                    b5108860dcfa  b1527e1cdcd4  
node-exporter.central-computehci0-1  central-computehci0-1  172.23.1.44:9100        running (10h)     9m ago  10h    23.4M        -                    b5108860dcfa  82d6b639e765  
node-exporter.central-computehci0-2  central-computehci0-2  172.23.1.134:9100       running (10h)     9m ago  10h    23.3M        -                    b5108860dcfa  ec536f3619b2  
node-exporter.central-controller-0   central-controller-0   172.23.1.55:9100        running (18m)    62s ago  10h    22.9M        -                    b5108860dcfa  1fde2f52ea3c  
node-exporter.central-controller-1   central-controller-1   172.23.1.124:9100       running (10h)     9m ago  10h    23.9M        -                    b5108860dcfa  adf90fef6e0e  
node-exporter.central-controller-2   central-controller-2   172.23.1.243:9100       running (10h)     9m ago  10h    22.5M        -                    b5108860dcfa  3d6633ba6863  
osd.0                                central-computehci0-2                          running (13h)     9m ago  13h     202M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  d3f03a158c27  
osd.1                                central-computehci0-1                          running (13h)     9m ago  13h     121M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  87077928ba18  
osd.10                               central-computehci0-1                          running (13h)     9m ago  13h     135M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  4a31550e874a  
osd.11                               central-computehci0-0                          running (13h)     9m ago  13h     168M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  f196eac4b408  
osd.12                               central-computehci0-2                          running (13h)     9m ago  13h     134M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  8f8d4c884e7a  
osd.13                               central-computehci0-1                          running (13h)     9m ago  13h     148M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  f6373c49b234  
osd.14                               central-computehci0-0                          running (13h)     9m ago  13h     161M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  ad85af6b73ba  
osd.2                                central-computehci0-0                          running (13h)     9m ago  13h     125M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  8f073e9e1ec0  
osd.3                                central-computehci0-2                          running (13h)     9m ago  13h     198M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  fa89d154d31a  
osd.4                                central-computehci0-1                          running (13h)     9m ago  13h     169M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  4eb99322361e  
osd.5                                central-computehci0-0                          running (13h)     9m ago  13h     139M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  1b6d02de2427  
osd.6                                central-computehci0-2                          running (13h)     9m ago  13h     115M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  dddf5f028efe  
osd.7                                central-computehci0-1                          running (13h)     9m ago  13h     199M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  357298628af9  
osd.8                                central-computehci0-0                          running (13h)     9m ago  13h     169M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  0492b75a9184  
osd.9                                central-computehci0-2                          running (13h)     9m ago  13h     143M    4096M  16.2.7-100.el8cp  9ea8ac4eae90  fd8cb3075a7b  
prometheus.central-controller-0      central-controller-0   172.23.1.55:9092        running (18m)    62s ago  10h     106M        -                    ff555c91d92a  84b5abca2682  
prometheus.central-controller-1      central-controller-1   172.23.1.124:9092       running (20h)     9m ago  10h     173M        -                    ff555c91d92a  6240671a95dd  
prometheus.central-controller-2      central-controller-2   172.23.1.243:9092       running (20h)     9m ago  10h     170M        -                    ff555c91d92a  ace6c7ab81b4  
rgw.rgw.central-controller-0.yqfboh  central-controller-0   172.23.1.55:8080        running (18m)    62s ago  12h    47.5M        -  16.2.7-100.el8cp  9ea8ac4eae90  977411759a05  
rgw.rgw.central-controller-1.mmwccp  central-controller-1   172.23.1.124:8080       running (12h)     9m ago  12h    61.4M        -  16.2.7-100.el8cp  9ea8ac4eae90  3e84c4787e83  
rgw.rgw.central-controller-2.ddemwo  central-controller-2   172.23.1.243:8080       running (12h)     9m ago  12h    67.2M        -  16.2.7-100.el8cp  9ea8ac4eae90  9a3b3fb8d777



Version-Release number of selected component (if applicable):
cephadm-16.2.7-121.el9cp.noarch

How reproducible:
always

Steps to Reproduce:
1. Deploy OSP17 with ceph-dashboard using cephadm (the setup used in this case).

Actual results:
alertmanager config:
- name: 'ceph-dashboard'
  webhook_configs:
  - url: 'https://172.23.1.55:8444//api/prometheus_receiver'
  - url: 'https://192.168.24.19:8444/api/prometheus_receiver'
  - url: 'https://192.168.24.44:8444/api/prometheus_receiver'

Expected results:
- name: 'ceph-dashboard'
  webhook_configs:
  - url: 'https://172.23.1.55:8444//api/prometheus_receiver'
  - url: 'https://172.23.1.124:8444/api/prometheus_receiver'
  - url: 'https://172.23.1.243:8444/api/prometheus_receiver'
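
For illustration, the expected list can be derived from the per-mgr `mgr/dashboard/<mgr>/server_addr` values and `mgr/dashboard/ssl_server_port` shown in the `ceph config dump` output above. This is a hedged sketch only: it is not how cephadm generates the file, and the shipped fix uses FQDNs rather than these addresses. The helper name `webhook_urls` is hypothetical, and the sketch always emits a single slash before `api`, unlike the first URL pasted in this report.

```python
from urllib.parse import urlunsplit

# Values copied from the `ceph config dump` output in this report.
DASHBOARD_ADDRS = {
    "central-controller-0.ymchoy": "172.23.1.55",
    "central-controller-1.folekp": "172.23.1.124",
    "central-controller-2.rmvfub": "172.23.1.243",
}
SSL_SERVER_PORT = 8444


def webhook_urls(addrs, port, scheme="https"):
    """Build one prometheus_receiver URL per mgr from its dashboard address."""
    urls = []
    for mgr in sorted(addrs):
        host = addrs[mgr]
        # Bracket IPv6 literals so the port separator stays unambiguous.
        if ":" in host:
            host = f"[{host}]"
        netloc = f"{host}:{port}"
        urls.append(urlunsplit((scheme, netloc, "/api/prometheus_receiver", "", "")))
    return urls


if __name__ == "__main__":
    for url in webhook_urls(DASHBOARD_ADDRS, SSL_SERVER_PORT):
        print(f"  - url: '{url}'")
```

The same helper accepts host names instead of IP addresses, which is closer to the FQDN-based approach that hostname verification would require.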

Additional info:

Comment 1 RHEL Program Management 2022-06-20 15:50:31 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 19 errata-xmlrpc 2022-08-09 17:39:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5997

