Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2270009

Summary: [6.1z5][Monitoring Service][Prometheus] Prometheus service deployment fails on the latest 6.1z5 builds.
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Mohit Bisht <mobisht>
Component: Ceph-Dashboard
Assignee: Nizamudeen <nia>
Status: CLOSED ERRATA
QA Contact: Vinayak Papnoi <vpapnoi>
Severity: urgent
Docs Contact: Akash Raj <akraj>
Priority: unspecified
Version: 6.1
CC: akraj, athakkar, ceph-eng-bugs, cephqe-warriors, nia, tserlin, vpapnoi
Target Milestone: ---
Keywords: Automation, Regression, TestBlocker
Target Release: 6.1z5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-17.2.6-206.el9cp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2024-04-01 10:20:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2267617

Description Mohit Bisht 2024-03-18 07:35:49 UTC
Description of problem:
[6.1z5][Monitoring Service][Prometheus] Prometheus service deployment fails on the latest 6.1z5 builds.

With the latest 6.1z5 build (17.2.6-205), Prometheus service deployment fails during regression runs.

Spec YAML file content:
---
service_type: prometheus
service_name: prometheus
placement:
  count: 1
  hosts: ['ceph-regression-fugkar-t4lm8t-node1-installer']
---
service_type: grafana
service_name: grafana
placement:
  hosts: ['ceph-regression-fugkar-t4lm8t-node1-installer']
---
service_type: alertmanager
service_name: alertmanager
placement:
  count: 1
---
service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: '*'
---
service_type: crash
service_name: crash
placement:
  host_pattern: '*'

# cephadm -v shell  --mount /tmp:/tmp -- ceph orch apply -i /tmp/tmp_ewe7dvf.yaml
2024-03-15 12:26:04,110 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph.py:1597 - Command completed successfully
2024-03-15 12:26:04,111 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph_admin.helper.py:1020 - 0/1 prometheus up... retrying
2024-03-15 12:26:04,111 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph.py:1563 - Running command cephadm shell -- ceph orch ls --service_type prometheus --format json --refresh on 10.0.195.125 timeout 600
2024-03-15 12:26:06,172 (cephci.test_cephadm) [INFO] - cephci.Regression.rgw.40.cephci.ceph.ceph.py:1597 - Command completed successfully
2024-03-15 12:26:06,173 (cephci.test_cephadm) [ERROR] - cephci.Regression.rgw.40.cephci.ceph.ceph_admin.helper.py:1037 - prometheus failed with 
['2024-03-15T06:27:16.645743Z service:prometheus [INFO] "service was created"']
2024-03-15 12:26:06,174 (cephci.test_cephadm) [ERROR] - cephci.Regression.rgw.40.cephci.tests.ceph_installer.test_cephadm.py:176 - prometheus service deployment failed!!!
Traceback (most recent call last):
  File "/home/jenkins/ceph-builds/17.2.6-205/Regression/rgw/40/cephci/tests/ceph_installer/test_cephadm.py", line 160, in run
    func(cfg)
  File "/home/jenkins/ceph-builds/17.2.6-205/Regression/rgw/40/cephci/ceph/ceph_admin/orch.py", line 198, in apply_spec
    validate_spec_services(
  File "/home/jenkins/ceph-builds/17.2.6-205/Regression/rgw/40/cephci/ceph/ceph_admin/helper.py", line 1069, in validate_spec_services
    raise Exception(f"{svc_name or svc_type} service deployment failed!!!")
Exception: prometheus service deployment failed!!!
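
The "0/1 prometheus up" check in the log above can be approximated outside the test framework. The following is a minimal sketch, not the cephci implementation: it assumes that `ceph orch ls --format json` prints a list of service objects, each with a `status` object holding `running` and `size` counters and an optional `events` list of strings in the form shown in the log. The names `parse_event` and `service_state` are hypothetical.

```python
import json
import re

# Hypothetical helpers, not part of cephci. Assumes each entry printed by
# `ceph orch ls --format json` carries a "status" object with "running" and
# "size" counters and an optional "events" list of strings.
EVENT_RE = re.compile(
    r'^(?P<ts>\S+)\s+(?P<subject>\S+)\s+\[(?P<level>[A-Z]+)\]\s+"(?P<msg>.*)"$'
)


def parse_event(line):
    """Split one orchestrator event string into its fields, or return None."""
    match = EVENT_RE.match(line.strip())
    return match.groupdict() if match else None


def service_state(orch_ls_json, service_name):
    """Return (running, size, events) for one service from `orch ls` JSON."""
    for svc in json.loads(orch_ls_json):
        if svc.get("service_name") == service_name:
            status = svc.get("status", {})
            events = [parse_event(e) for e in svc.get("events", [])]
            return status.get("running", 0), status.get("size", 0), events
    raise LookupError(f"service {service_name!r} not found")


# Sample input modelled on the event string reported in the log above.
sample = json.dumps([{
    "service_type": "prometheus",
    "service_name": "prometheus",
    "status": {"running": 0, "size": 1},
    "events": [
        '2024-03-15T06:27:16.645743Z service:prometheus [INFO] "service was created"'
    ],
}])

running, size, events = service_state(sample, "prometheus")
print(f"{running}/{size} prometheus up")
for event in events:
    if event:  # skip event strings that do not match the assumed layout
        print(event["level"], event["msg"])
```

With this reading, the service exists and only reports the creation event, but no daemon is running, which matches the failure above.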


Note:
With the previous 6.1z5 build (17.2.6-202), Prometheus service deployment works as expected.

Version-Release number of selected component (if applicable):
17.2.6-205

How reproducible:
Always

Steps to Reproduce:
1. Bootstrap cluster
2. Deploy the monitoring services (prometheus) using a spec file
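
Step 2 can be scripted with the two commands that appear in the log in the description (`ceph orch apply -i` and `ceph orch ls --service_type prometheus --format json --refresh`, both run through `cephadm shell`). This is a rough sketch under stated assumptions, not the cephci code: the helper names, the retry count, and the delay are hypothetical, and the `status.running` / `status.size` fields are assumed to be present in the JSON output.

```python
import json
import subprocess
import time


def cephadm_shell(args, mounts=()):
    """Build the `cephadm shell` argv for a ceph command, as used in the report."""
    argv = ["cephadm", "shell"]
    for mount in mounts:
        argv += ["--mount", mount]
    return argv + ["--"] + list(args)


def run(argv):
    """Run a command on the installer node and return its stdout."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def wait_for_service(service_type, runner=run, retries=30, delay=10.0):
    """Poll `ceph orch ls` until every expected daemon of the service runs."""
    cmd = cephadm_shell(["ceph", "orch", "ls", "--service_type", service_type,
                         "--format", "json", "--refresh"])
    for _ in range(retries):
        try:
            services = json.loads(runner(cmd))
        except json.JSONDecodeError:
            services = []  # non-JSON output is treated as "not up yet"
        # Assumption: "status" carries "running" and "size" counters.
        if services and all(
            s["status"]["size"] > 0
            and s["status"]["running"] >= s["status"]["size"]
            for s in services
        ):
            return True
        time.sleep(delay)
    return False


def reproduce(spec_path, runner=run):
    """Apply the spec file, then wait for the prometheus service to come up."""
    runner(cephadm_shell(["ceph", "orch", "apply", "-i", spec_path],
                         mounts=["/tmp:/tmp"]))
    return wait_for_service("prometheus", runner=runner)
```

Calling `reproduce("/tmp/<spec>.yaml")` on the bootstrapped installer node would be expected to return `False` on an affected build. The `runner` parameter exists so the polling logic can be exercised with a stub instead of a live cluster.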

Actual results:
Prometheus service deployment fails.

Expected results:
Prometheus service deployment succeeds.

Comment 13 errata-xmlrpc 2024-04-01 10:20:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 6.1 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:1580

Comment 14 Red Hat Bugzilla 2024-07-31 04:25:14 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days