Bug 2270009 - [6.1z5][Monitoring Service][Prometheus]For latest 6.1z5 builds Prometheus service deployment failing.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Dashboard
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 6.1z5
Assignee: Nizamudeen
QA Contact: Vinayak Papnoi
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2267617
Reported: 2024-03-18 07:35 UTC by Mohit Bisht
Modified: 2024-07-31 04:25 UTC (History)

Fixed In Version: ceph-17.2.6-206.el9cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-01 10:20:20 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-8547 0 None None None 2024-03-18 07:38:26 UTC
Red Hat Issue Tracker RHCSDASH-1314 0 None None None 2024-03-18 07:38:30 UTC
Red Hat Product Errata RHBA-2024:1580 0 None None None 2024-04-01 10:20:24 UTC

Description Mohit Bisht 2024-03-18 07:35:49 UTC
Description of problem:
[6.1z5][Monitoring Service][Prometheus] Prometheus service deployment is failing on the latest 6.1z5 builds.

With the latest 6.1z5 build (17.2.6-205), Prometheus service deployment fails during regression runs.

Spec yaml file content:
---
service_type: prometheus
service_name: prometheus
placement:
  count: 1
  hosts: ['ceph-regression-fugkar-t4lm8t-node1-installer']
---
service_type: grafana
service_name: grafana
placement:
  hosts: ['ceph-regression-fugkar-t4lm8t-node1-installer']
---
service_type: alertmanager
service_name: alertmanager
placement:
  count: 1
---
service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: '*'
---
service_type: crash
service_name: crash
placement:
  host_pattern: '*'

# cephadm -v shell  --mount /tmp:/tmp -- ceph orch apply -i /tmp/tmp_ewe7dvf.yaml
2024-03-15 12:26:04,110 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph.py:1597 - Command completed successfully
2024-03-15 12:26:04,111 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph_admin.helper.py:1020 - 0/1 prometheus up... retrying
2024-03-15 12:26:04,111 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph.py:1563 - Running command cephadm shell -- ceph orch ls --service_type prometheus --format json --refresh on 10.0.195.125 timeout 600
2024-03-15 12:26:06,172 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph.py:1597 - Command completed successfully
2024-03-15 12:26:06,173 (cephci.test_cephadm) [ERROR] - cephci.Regression.rgw.40.cephci.ceph.ceph_admin.helper.py:1037 - prometheus failed with 
['2024-03-15T06:27:16.645743Z service:prometheus [INFO] "service was created"']
2024-03-15 12:26:06,174 (cephci.test_cephadm) [ERROR] - cephci.Regression.rgw.40.cephci.tests.ceph_installer.test_cephadm.py:176 - prometheus service deployment failed!!!
Traceback (most recent call last):
  File "/home/jenkins/ceph-builds/17.2.6-205/Regression/rgw/40/cephci/tests/ceph_installer/test_cephadm.py", line 160, in run
    func(cfg)
  File "/home/jenkins/ceph-builds/17.2.6-205/Regression/rgw/40/cephci/ceph/ceph_admin/orch.py", line 198, in apply_spec
    validate_spec_services(
  File "/home/jenkins/ceph-builds/17.2.6-205/Regression/rgw/40/cephci/ceph/ceph_admin/helper.py", line 1069, in validate_spec_services
    raise Exception(f"{svc_name or svc_type} service deployment failed!!!")
Exception: prometheus service deployment failed!!!
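
The check that fails in the log above ("0/1 prometheus up") can be approximated with the sketch below. It is not the cephci implementation. It assumes that `ceph orch ls --format json` returns a list of service entries, each with a `status` object holding `running` and `size` counts and an optional `events` list; those field names and the sample payload are assumptions made for illustration, shaped after the log lines above.

```python
import json


def service_is_up(orch_ls_json, service_type="prometheus"):
    """Return True only if every matching service has running == size > 0.

    Assumes the JSON layout described in the lead-in (hypothetical here).
    """
    services = json.loads(orch_ls_json)
    matches = [s for s in services if s.get("service_type") == service_type]
    if not matches:
        return False
    for svc in matches:
        status = svc.get("status", {})
        running = status.get("running", 0)
        size = status.get("size", 0)
        if size == 0 or running < size:
            return False
    return True


def service_events(orch_ls_json, service_type="prometheus"):
    """Collect the event strings reported for the matching services."""
    events = []
    for svc in json.loads(orch_ls_json):
        if svc.get("service_type") == service_type:
            events.extend(svc.get("events", []))
    return events


# Hypothetical payload mirroring the failing state in the log: 0 of 1 running.
sample = json.dumps([
    {
        "service_type": "prometheus",
        "service_name": "prometheus",
        "status": {"running": 0, "size": 1},
        "events": [
            '2024-03-15T06:27:16.645743Z service:prometheus [INFO] "service was created"'
        ],
    }
])

if not service_is_up(sample):
    print("prometheus not up; events:", service_events(sample))
```

With a payload like this, the only reported event is "service was created", which matches the log: the spec was accepted but no daemon came up.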


Note:
With the previous 6.1z5 build (17.2.6-202), Prometheus service deployment worked as expected.

Version-Release number of selected component (if applicable):
17.2.6-205

How reproducible:
Always

Steps to Reproduce:
1. Bootstrap cluster
2. Deploy the monitoring services (including prometheus) using the spec file
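
Step 2 can be scripted roughly as follows. This is a minimal sketch that only builds the two command lines seen in this report (the `ceph orch apply` call and the `ceph orch ls` check); the helper names are hypothetical, the spec path is the temporary file from the log, and bootstrapping the cluster (step 1) is not covered.

```python
import shlex


def apply_spec_command(spec_path, mount="/tmp:/tmp"):
    """Build the cephadm command that applies a service spec file."""
    return [
        "cephadm", "-v", "shell", "--mount", mount, "--",
        "ceph", "orch", "apply", "-i", spec_path,
    ]


def list_service_command(service_type):
    """Build the cephadm command that lists one service type as JSON."""
    return [
        "cephadm", "shell", "--",
        "ceph", "orch", "ls",
        "--service_type", service_type,
        "--format", "json", "--refresh",
    ]


# Print the commands instead of running them; pass the lists to
# subprocess.run(...) on a bootstrapped node to execute them.
print(shlex.join(apply_spec_command("/tmp/tmp_ewe7dvf.yaml")))
print(shlex.join(list_service_command("prometheus")))
```

Keeping the commands as argument lists avoids shell quoting issues if the spec path contains spaces.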

Actual results:
Prometheus service deployment fails.

Expected results:
Prometheus service deployment succeeds.

Comment 13 errata-xmlrpc 2024-04-01 10:20:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 6.1 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:1580

Comment 14 Red Hat Bugzilla 2024-07-31 04:25:14 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

