Bug 1947147 - [FFU OSP13 to 16.1.4] After FFU upgrade overcloud node introspection failed "Can not transition from state 'uninitialized' on event 'sync'"
Summary: [FFU OSP13 to 16.1.4] After FFU upgrade overcloud node introspection failed "...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic-inspector
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z8
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Julia Kreger
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 2010469
Reported: 2021-04-07 19:11 UTC by Paras Babbar
Modified: 2022-03-24 10:59 UTC (History)

Fixed In Version: openstack-ironic-inspector-9.2.4-1.20211112163411.2c85396.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2010469 2010990
Environment:
Last Closed: 2022-03-24 10:59:23 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 785245 | 0 | None | MERGED | Ignored error state cache for new requests | 2021-11-12 15:38:02 UTC
OpenStack gerrit | 812377 | 0 | None | MERGED | Ignored error state cache for new requests | 2021-11-12 15:38:06 UTC
Red Hat Issue Tracker | OSP-1982 | 0 | None | None | None | 2021-11-12 15:57:08 UTC
Red Hat Product Errata | RHBA-2022:0986 | 0 | None | None | None | 2022-03-24 10:59:50 UTC

Description Paras Babbar 2021-04-07 19:11:22 UTC
Description of problem:

After an FFU upgrade with Ironic enabled in the overcloud (OC), introspection of the overcloud nodes failed with the following error:


TASK [ironic-overcloud : Introspect the nodes] *********************************
changed: [titan101.lab.eng.tlv2.redhat.com] => (item=ironic-0) => {
    "ansible_loop_var": "item",
    "changed": true,
    "cmd": "source /home/stack/overcloudrc\nopenstack baremetal introspection start --wait ironic-0\n",
    "delta": "0:00:04.092840",
    "end": "2021-04-07 15:11:32.313959",
    "item": "ironic-0",
    "rc": 0,
    "start": "2021-04-07 15:11:28.221119"
}

STDOUT:

+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| UUID     | Error                                                                                                                                                                            |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ironic-0 | The PXE filter driver DnsmasqFilter, state=uninitialized: my fsm encountered an exception: Can not transition from state 'uninitialized' on event 'sync' (no defined transition) |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


STDERR:

Waiting for introspection to finish...
changed: [titan101.lab.eng.tlv2.redhat.com] => (item=ironic-1) => {
    "ansible_loop_var": "item",
    "changed": true,
    "cmd": "source /home/stack/overcloudrc\nopenstack baremetal introspection start --wait ironic-1\n",
    "delta": "0:00:03.396431",
    "end": "2021-04-07 15:11:36.302769",
    "item": "ironic-1",
    "rc": 0,
    "start": "2021-04-07 15:11:32.906338"
}

STDOUT:

+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| UUID     | Error                                                                                                                                                                            |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ironic-1 | The PXE filter driver DnsmasqFilter, state=uninitialized: my fsm encountered an exception: Can not transition from state 'uninitialized' on event 'sync' (no defined transition) |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

STDERR:

Waiting for introspection to finish...
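
The error text above reads like a finite-state machine rejecting an event that has no transition defined from the current state. A minimal, self-contained sketch of that behaviour follows. It is a toy model, not ironic-inspector code: the class, the exception, and every state or event name other than 'uninitialized' and 'sync' are hypothetical.

```python
# Toy finite-state machine. Only the names 'uninitialized' and 'sync'
# come from the error message; everything else is hypothetical.


class NoTransition(Exception):
    """Raised when the current state has no transition for an event."""


class ToyFilterFSM:
    # Maps (current state, event) to the next state.
    TRANSITIONS = {
        ("uninitialized", "initialize"): "initialized",
        ("initialized", "sync"): "initialized",
        ("initialized", "reset"): "uninitialized",
    }

    def __init__(self):
        self.state = "uninitialized"

    def process_event(self, event):
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            # No entry in the table: the event is rejected and the
            # state is left unchanged.
            raise NoTransition(
                "Can not transition from state %r on event %r "
                "(no defined transition)" % (self.state, event)
            )
        self.state = self.TRANSITIONS[key]
        return self.state
```

In this toy model, a 'sync' sent before 'initialize' is rejected, while the same event succeeds once the machine has left 'uninitialized'.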

Version-Release number of selected component (if applicable):


How reproducible:
Happens after an FFU upgrade

Steps to Reproduce:
1. Perform an FFU upgrade from OSP13 to 16.1.4
2. Start introspection of the overcloud (OC) nodes


Actual results:
Introspection failed

Expected results:
Introspection succeeds

Additional info:

Comment 1 Julia Kreger 2021-04-09 15:21:00 UTC
I've posted what I guess is part of a fix to upstream gerrit for this. In essence, I'm not sure there is a good way to handle the transitory database failure upon trying, since there are so many different factors. But that failure can leave us in a state where we cannot retry. The patch I've posted upstream should allow a retry to be processed, so that a client doesn't interpret the "error" state retained from the previous attempt as a hard failure on the next introspection attempt.

CC'ing dtantsur as this issue may be more observable in his work, and as such he might have ideas on the database, but I think that it may be a dangerous path to take since the process does recover anyway when the cluster is healthy again.
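
The retry idea described in this comment can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the upstream patch: `NodeCache`, `start_introspection`, the state constants, and the `run` callback are all hypothetical names. The point it shows is that an "error" state left behind by an earlier attempt is overwritten when a new request arrives, instead of being reported back as a failure of that new request.

```python
# Hypothetical sketch: none of these names come from ironic-inspector.
ERROR = "error"
STARTING = "starting"
FINISHED = "finished"


class NodeCache:
    """In-memory stand-in for the cached per-node introspection state."""

    def __init__(self):
        self._states = {}

    def get(self, node):
        return self._states.get(node)

    def set(self, node, state):
        self._states[node] = state


def start_introspection(cache, node, run):
    """Start a new introspection, ignoring a stale cached 'error' state."""
    if cache.get(node) == STARTING:
        raise RuntimeError("introspection already in progress for %s" % node)
    # A cached 'error' from an earlier attempt is not treated as a
    # failure of this new request; it is simply overwritten.
    cache.set(node, STARTING)
    try:
        run(node)
    except Exception:
        # Record the failure so it is visible, then re-raise it.
        cache.set(node, ERROR)
        raise
    cache.set(node, FINISHED)
    return cache.get(node)
```

With this shape, a first attempt that fails on a transient error (for example, a database that is briefly unavailable) leaves the node in the "error" state, and a second call for the same node proceeds normally once the backend is healthy again.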

Comment 4 Julia Kreger 2021-06-09 13:33:15 UTC
Fix for this issue is in review upstream.

Comment 5 Julia Kreger 2021-10-04 18:05:31 UTC
Backport for this change has been proposed upstream.

Comment 14 errata-xmlrpc 2022-03-24 10:59:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0986

