Bug 1593909
Summary: | Overcloud Nodes listed as "ERROR" after Upgrade to OSP13
---|---
Product: | Red Hat OpenStack
Component: | python-tripleoclient
Version: | 13.0 (Queens)
Status: | CLOSED DUPLICATE
Severity: | unspecified
Priority: | unspecified
Hardware: | Unspecified
OS: | Unspecified
Reporter: | Darin Sorrentino <dsorrent>
Assignee: | Jiri Stransky <jstransk>
QA Contact: | Gurenko Alex <agurenko>
CC: | hbrock, jpichon, jslagle, jstransk, mburns
Type: | Bug
Last Closed: | 2018-06-25 15:16:15 UTC

Thanks for the report, Darin. We've hit this recently in other environments too. It's a race condition between nova-compute and ironic-conductor starting up: if nova-compute comes up before ironic-conductor is able to reply to requests, the instances backed by ironic go to ERROR. The workaround is `openstack server set --state active <server-id>`. This is being tracked as bug 1590297, so I'll mark this one as a duplicate.

*** This bug has been marked as a duplicate of bug 1590297 ***

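When several nodes are affected, the workaround above has to be repeated per server. A minimal Python sketch of that loop follows. It assumes the `openstack` CLI is on the path with undercloud credentials sourced, and that `openstack server list -f json` returns objects keyed by the same column names the table output shows (`ID`, `Name`, `Status`). The function names are hypothetical and only for illustration. It defaults to a dry run that prints the commands without executing them.

```python
import json
import subprocess


def servers_in_error(servers):
    """Return (id, name) pairs for servers whose Status is ERROR."""
    return [(s["ID"], s["Name"]) for s in servers if s.get("Status") == "ERROR"]


def reset_commands(servers):
    """Build one 'openstack server set --state active <id>' command per ERROR server."""
    return [
        ["openstack", "server", "set", "--state", "active", server_id]
        for server_id, _name in servers_in_error(servers)
    ]


def main(dry_run=True):
    """List servers via the CLI and reset the ones in ERROR (print only if dry_run)."""
    # Assumption: '-f json' yields a list of dicts keyed by the column headers.
    listing = subprocess.run(
        ["openstack", "server", "list", "-f", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    for cmd in reset_commands(json.loads(listing)):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `main()` prints the commands it would run; `main(dry_run=False)` executes them. This only changes the state recorded by nova, so it fits the case described in this bug, where the nodes themselves are still functional.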
Created attachment 1453600 [details]
sosreport from the server showing 3 overcloud nodes in ERROR state

Description of problem:

Both Chris J (cjanisze) and I hit this issue. At the completion of the upgrade to OSP13 on the Director node, some or all of the Overcloud nodes show in an ERROR state:

    (undercloud) [stack@ds-hf-ca-undercloud ~]$ openstack server list
    +--------------------------------------+------------------------+--------+-----------------------+--------------------------------+---------+
    | ID                                   | Name                   | Status | Networks              | Image                          | Flavor  |
    +--------------------------------------+------------------------+--------+-----------------------+--------------------------------+---------+
    | 3cd682e6-b2c0-4505-af7a-a01786a5cfe4 | overcloud-controller-2 | ACTIVE | ctlplane=172.16.0.105 | overcloud-full_20180619T142126 | control |
    | afb6d2a8-0937-488b-85dd-157ac38ad6bf | overcloud-controller-0 | ACTIVE | ctlplane=172.16.0.101 | overcloud-full_20180619T142126 | control |
    | 1f57af8d-bdc5-41b9-a58c-b561a7cfe927 | overcloud-compute-0    | ERROR  | ctlplane=172.16.0.112 | overcloud-full_20180619T142126 | compute |
    | 2b6f3e6c-83d0-4fe1-856e-a001be10287e | overcloud-compute-1    | ERROR  | ctlplane=172.16.0.103 | overcloud-full_20180619T142126 | compute |
    | d3b7b0be-3a55-4a0e-a1fd-15c401b392bb | overcloud-controller-1 | ERROR  | ctlplane=172.16.0.108 | overcloud-full_20180619T142126 | control |
    +--------------------------------------+------------------------+--------+-----------------------+--------------------------------+---------+
    (undercloud) [stack@ds-hf-ca-undercloud ~]$

In my environment (above), 3 nodes are in ERROR state while 2 remain ACTIVE. Chris had all of his nodes in ERROR state.

The Overcloud appears to be functional, so we are going to just use nova to reset the state to active. I am attaching an sosreport from my environment, taken before I force the state change to active.