Bug 2208237
| Summary: | [FFU] Controller Nodes in MAINTENANCE state after Overcloud Ctlplane System Upgrade | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Ricardo Diaz <rdiazcam> |
| Component: | rhosp-release | Assignee: | Juan Badia Payno <jbadiapa> |
| Status: | CLOSED NOTABUG | QA Contact: | Arik Chernetsky <achernet> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 17.1 (Wallaby) | CC: | ekuris, jbadiapa, jjoyce, jpretori |
| Target Milestone: | ga | Keywords: | TestOnly, Triaged, UpgradeBlocker |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-06-07 12:29:15 UTC | Type: | Bug |
Description
Ricardo Diaz
2023-05-18 10:41:46 UTC
Unsetting maintenance for a controller appears to work at first:

(undercloud) [stack@undercloud-0 ~]$ metalsmith list
+--------------------------------------+--------------+--------------------------------------+--------------------+-------------+----------------------+
| UUID                                 | Node Name    | Allocation UUID                      | Hostname           | State       | IP Addresses         |
+--------------------------------------+--------------+--------------------------------------+--------------------+-------------+----------------------+
| 6dca4b5f-ac03-4a04-9516-cb88fc148012 | compute-0    | b897d9a0-0ef7-4a3f-9c39-71eff5b9673d | computedpdksriov-0 | ACTIVE      | ctlplane=192.0.70.17 |
| 32199f18-b156-4e07-9570-56f6e90eb64c | compute-1    | 94c7c4e7-8c8d-4ea9-9bbe-017f43c4d134 | computedpdksriov-1 | ACTIVE      | ctlplane=192.0.70.14 |
| 46c84e87-4549-4d96-beea-ababb43e2236 | controller-0 | 7ff671a4-f665-477a-b733-c9e9a827ffa1 | controller-0       | MAINTENANCE | ctlplane=192.0.70.15 |
| f90624be-e487-4ad7-8a47-4649dd545c81 | controller-1 | a129feb1-52d0-41d1-9c96-d85f7e4559a2 | controller-1       | MAINTENANCE | ctlplane=192.0.70.9  |
| e296f959-b31a-436f-a87d-fc5febaac5b0 | controller-2 | 623fe14e-2ffb-488e-a7ca-e1f2bc79e007 | controller-2       | MAINTENANCE | ctlplane=192.0.70.6  |
+--------------------------------------+--------------+--------------------------------------+--------------------+-------------+----------------------+
(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node maintenance unset controller-0
(undercloud) [stack@undercloud-0 ~]$ metalsmith list
+--------------------------------------+--------------+--------------------------------------+--------------------+-------------+----------------------+
| UUID                                 | Node Name    | Allocation UUID                      | Hostname           | State       | IP Addresses         |
+--------------------------------------+--------------+--------------------------------------+--------------------+-------------+----------------------+
| 6dca4b5f-ac03-4a04-9516-cb88fc148012 | compute-0    | b897d9a0-0ef7-4a3f-9c39-71eff5b9673d | computedpdksriov-0 | ACTIVE      | ctlplane=192.0.70.17 |
| 32199f18-b156-4e07-9570-56f6e90eb64c | compute-1    | 94c7c4e7-8c8d-4ea9-9bbe-017f43c4d134 | computedpdksriov-1 | ACTIVE      | ctlplane=192.0.70.14 |
| 46c84e87-4549-4d96-beea-ababb43e2236 | controller-0 | 7ff671a4-f665-477a-b733-c9e9a827ffa1 | controller-0       | ACTIVE      | ctlplane=192.0.70.15 |
| f90624be-e487-4ad7-8a47-4649dd545c81 | controller-1 | a129feb1-52d0-41d1-9c96-d85f7e4559a2 | controller-1       | MAINTENANCE | ctlplane=192.0.70.9  |
| e296f959-b31a-436f-a87d-fc5febaac5b0 | controller-2 | 623fe14e-2ffb-488e-a7ca-e1f2bc79e007 | controller-2       | MAINTENANCE | ctlplane=192.0.70.6  |
+--------------------------------------+--------------+--------------------------------------+--------------------+-------------+----------------------+

After a few minutes, however, the controller goes back to the MAINTENANCE state.

The problem seen in metalsmith with VMs is that IPMI is simulated with vbmc, and everything is installed on RHEL 8.4 in a virtualenv (Python 3.6). Once the undercloud OS is upgraded to RHEL 9.2, vbmc no longer works; it needs to be reinstalled and restarted.

This is a CI automation issue that would need to be solved in Infrared or through other CI automation changes. The issue is not in OSP.
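For CI jobs that want to detect this condition after the undercloud upgrade, the State column of the `metalsmith list` table can be filtered. The following is a minimal sketch, not part of the original report: the helper name `maintenance_nodes` is hypothetical, and it assumes the six-column table layout shown in the output above. It only reports affected nodes; since the root cause is the broken vbmc, unsetting maintenance alone is not expected to hold.

```shell
#!/bin/sh
# Hypothetical helper: print the "Node Name" of every row whose "State"
# column is MAINTENANCE. Reads `metalsmith list` table output on stdin.
# Splitting on "|" yields: $2=UUID, $3=Node Name, $4=Allocation UUID,
# $5=Hostname, $6=State, $7=IP Addresses (assumed layout, as in the report).
maintenance_nodes() {
  awk -F'|' 'NF > 6 && $6 ~ /MAINTENANCE/ { gsub(/ /, "", $3); print $3 }'
}

# Example use on the undercloud (commented out; requires the metalsmith CLI):
#   metalsmith list | maintenance_nodes
```

If the helper prints any controller names after the system upgrade, that would suggest checking the vbmc installation before retrying `openstack baremetal node maintenance unset`.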