Bug 2179006

Summary: [RHOSP 17.1] sensubility reports incorrect state of containers
Product: Red Hat OpenStack
Reporter: Yadnesh Kulkarni <ykulkarn>
Component: openstack-tripleo-heat-templates
Assignee: Martin Magr <mmagr>
Status: CLOSED ERRATA
QA Contact: Leonid Natapov <lnatapov>
Severity: high
Docs Contact: mgeary <mgeary>
Priority: high
Version: 17.1 (Wallaby)
CC: lmadsen, mburns, mmagr, mrunge
Target Milestone: ga
Keywords: Triaged
Target Release: 17.1
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-14.3.1-1.20230505003804.9fbc89a.el9ost
Last Closed: 2023-08-16 01:14:22 UTC
Type: Bug

Description Yadnesh Kulkarni 2023-03-16 12:25:58 UTC
Description of problem:

Running containers are reported as not healthy (metric value 0):
~~~
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-crash-ceph-0", service="default-cloud1-sens-meter"} 0
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-0", service="default-cloud1-sens-meter"} 0
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-1", service="default-cloud1-sens-meter"} 0
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-2", service="default-cloud1-sens-meter"} 0
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-3", service="default-cloud1-sens-meter"} 0
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-4", service="default-cloud1-sens-meter"} 0
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="ceph-0.redhat.local", process="rsyslog", service="default-cloud1-sens-meter"} 0
~~~

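The ".State.Health.Status" field appears to be empty for all of these containers, which is probably why collectd-sensubility reports them as 0. The containers themselves are up, as the listing and the "podman inspect" excerpt below show: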
~~~
9393764e1b1b  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:f165a015a577bc7aeebb22355f01d471722cee7873fa0f7cd23e1c210e45f30e  -n client.crash.c...  47 hours ago  Up 47 hours                        ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-crash-ceph-0
176c0336fcf2  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:f165a015a577bc7aeebb22355f01d471722cee7873fa0f7cd23e1c210e45f30e  -n osd.0 -f --set...  47 hours ago  Up 47 hours                        ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-0
316afc410c52  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:f165a015a577bc7aeebb22355f01d471722cee7873fa0f7cd23e1c210e45f30e  -n osd.1 -f --set...  47 hours ago  Up 47 hours                        ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-1
53229e4aee3e  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:f165a015a577bc7aeebb22355f01d471722cee7873fa0f7cd23e1c210e45f30e  -n osd.2 -f --set...  47 hours ago  Up 47 hours                        ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-2
dbcd431f8533  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:f165a015a577bc7aeebb22355f01d471722cee7873fa0f7cd23e1c210e45f30e  -n osd.3 -f --set...  47 hours ago  Up 47 hours                        ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-3
203680496da7  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph@sha256:f165a015a577bc7aeebb22355f01d471722cee7873fa0f7cd23e1c210e45f30e  -n osd.4 -f --set...  47 hours ago  Up 47 hours                        ceph-4ee8ce3d-0bfe-580f-9039-a05169349d5a-osd-4
dcb9dcbca739  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-rsyslog:17.1_20230309.1                                       kolla_start           45 hours ago  Up 45 hours                        rsyslog

[root@ceph-0 collectd.d]# podman inspect rsyslog
.
.
               "Health": {
                    "Status": "",
                    "FailingStreak": 0,
                    "Log": null
               },
.
.
~~~
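A minimal sketch of the behaviour one would expect here, assuming the collector should fall back to the container's run state when no healthcheck result is available. This is not the actual collectd-sensubility code or the shipped fix; the function name container_health and the 1/0 mapping are hypothetical, and the sample JSON only mirrors the "Health" excerpt above plus an assumed "Status": "running" field under "State".

```python
import json


def container_health(inspect_entry: dict) -> int:
    """Map one `podman inspect` entry to 1 (ok) or 0 (not ok).

    Hypothetical mapping for illustration only.
    """
    state = inspect_entry.get("State") or {}
    # "Health" may be missing or null when no healthcheck is defined.
    health_status = (state.get("Health") or {}).get("Status", "")

    if health_status:
        # A healthcheck exists and has produced a result: trust it.
        # Anything other than "healthy" (e.g. "unhealthy", "starting") maps to 0.
        return 1 if health_status == "healthy" else 0

    # Empty health status: no healthcheck defined, so use the run state instead.
    return 1 if state.get("Status") == "running" else 0


# Sample shaped like the rsyslog excerpt above; "Status": "running" is assumed.
sample = json.loads(
    '[{"State": {"Status": "running",'
    ' "Health": {"Status": "", "FailingStreak": 0, "Log": null}}}]'
)
print(container_health(sample[0]))
```

With this mapping, a running container with an empty health status yields 1, while a stopped container without a healthcheck yields 0.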

Version-Release number of selected component (if applicable):
collectd-sensubility-0.1.9-1.el9ost.x86_64


How reproducible:
Deploy STF 1.5.1 with OSP 17.1.

Comment 4 Martin Magr 2023-04-21 11:48:22 UTC
Wallaby backport submitted.

Comment 11 Leonid Natapov 2023-05-29 06:28:56 UTC
Stopped containers are reported as 0. Sensubility also reports the correct state for containers without a healthcheck.

Comment 19 errata-xmlrpc 2023-08-16 01:14:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)) and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

Comment 20 Red Hat Bugzilla 2023-12-15 04:26:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.