Bug 2223294
| Summary: | Collectd Sensubility doesn't work on OSP17.1 and RHEL8. | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Leonid Natapov <lnatapov> |
| Component: | collectd-sensubility | Assignee: | Martin Magr <mmagr> |
| Status: | ASSIGNED --- | QA Contact: | Leonid Natapov <lnatapov> |
| Severity: | high | Docs Contact: | mgeary <mgeary> |
| Priority: | high | ||
| Version: | 17.1 (Wallaby) | CC: | gregraka, lmadsen, mmagr, mrunge, pgrist |
| Target Milestone: | z1 | Keywords: | Triaged, ZStream |
| Target Release: | 17.1 | Flags: | mmagr: needinfo- |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Known Issue | |
| Doc Text: |
There is a known issue when performing an in-place upgrade from RHOSP 16.2 to 17.1 GA. The collection agent, `collectd-sensubility`, fails to run on RHEL 8 Compute nodes.
+
Workaround: On affected nodes, edit the file `/var/lib/container-config-scripts/collectd_check_health.py` and, on line 26, replace `"healthy: .State.Health.Status}"` with `"healthy: .State.Healthcheck.Status}"`.
|
| Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Collectd Sensubility doesn't work on OSP17.1 and RHEL8.
The latest collectd is collectd-5.12.0-10.el8ost.

This scenario may only happen after FFU in a mixed RHEL environment when the compute node(s) are RHEL8. On a clean OSP17.1 deployment this scenario won't happen.

The error that I get in sensubility.log
---------------------------------------
\\\"/scripts/collectd_check_health.py\\\", line 91, in \\u003cmodule\\u003e\\n rc, status = fetch_container_health(o.decode())\\n File \\\"/scripts/collectd_check_health.py\\\", line 74, in fetch_container_health\\n if len(item['healthy']) \\u003e 0 and item['status'] != 'stopped':\\nTypeError: object of type 'NoneType' has no len()\\n\",\"status\":\"1\"}}}"},"startsAt":"2023-07-14T11:12:27Z"}}]
[DEBUG] Requesting execution of check. [check: check-container-health]
[DEBUG] Executed check script. [output: Traceback (most recent call last):
  File "/scripts/collectd_check_health.py", line 91, in <module>
    rc, status = fetch_container_health(o.decode())
  File "/scripts/collectd_check_health.py", line 74, in fetch_container_health
    if len(item['healthy']) > 0 and item['status'] != 'stopped':
TypeError: object of type 'NoneType' has no len()

The problem is that the healthcheck script uses the podman inspect <container-name> command, whose output has apparently changed.

Workaround:
-----------
Change /var/lib/container-config-scripts/collectd_check_health.py on line 26:
s/"healthy: .State.Health.Status}"/"healthy: .State.Healthcheck.Status}"/
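
For illustration only, the two points above can be sketched in Python. This is not the shipped script or the eventual fix: `apply_workaround` and `has_health_status` are hypothetical helper names, and the sample dictionaries are made up. The first helper performs the substitution described in the workaround on the script text; the second mirrors the condition from line 74 of the traceback but treats a missing (`None`) `healthy` value as empty rather than raising `TypeError`.

```python
# Field names taken from the workaround; everything else here is an assumption.
OLD_FIELD = ".State.Health.Status"
NEW_FIELD = ".State.Healthcheck.Status"


def apply_workaround(script_text):
    """Return script_text with the old template field replaced by the new one.

    Running it twice is harmless, because the new field name does not
    contain the old one as a substring.
    """
    return script_text.replace(OLD_FIELD, NEW_FIELD)


def has_health_status(item):
    """None-tolerant variant of the check that fails in the traceback.

    The original expression, len(item['healthy']), raises TypeError when
    the value is None. Here None is treated like an empty string.
    """
    healthy = item.get("healthy") or ""
    return len(healthy) > 0 and item.get("status") != "stopped"


# Hypothetical line resembling the one named in the workaround.
before = 'fmt = "healthy: .State.Health.Status}"'
after = apply_workaround(before)

# Hypothetical items: the second imitates the failing case from the log.
items = [
    {"healthy": "healthy", "status": "running"},
    {"healthy": None, "status": "running"},
]
checked = [has_health_status(item) for item in items]
```

To apply the substitution to the real file, read `/var/lib/container-config-scripts/collectd_check_health.py`, pass its contents through `apply_workaround`, and write the result back, keeping a backup copy of the original first.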