Red Hat Bugzilla – Bug 1290457
Node registration fails with Introspection timeout
Last modified: 2016-09-13 12:23:27 EDT
Description of problem:
While deploying RHELOSP, at the Register Nodes step, after submitting valid data about the nodes, the following error appears:
Error Introspection timeout
I have not seen this behaviour before on this HW. Attaching tar of /var/log from both Satellite and RHELOSP machines. I was deploying on bare metal machines and registering two nodes at the same time.
When I look at Director, I can see the nodes are there, but have no flavor.
Version-Release number of selected component (if applicable):
How reproducible:
Happened to me once
Steps to Reproduce:
1. Start a deployment of RHELOSP, go to the Register Nodes step, and fill out the required information with valid data
2. Wait. After quite a long time (about an hour?), an error appears
Actual results:
Timeout while registering nodes
Expected results:
Node registration complete; able to go to the next step
This might be relevant: https://bugzilla.redhat.com/show_bug.cgi?id=1280263
Have you seen this with the latest composes (in particular the latest OOO iso compose)? We found that the swift services sometimes fail to start during installation because files haven't been created yet, which can cause node introspection to time out. We've filed a bug against OSP and have added a line to restart the swift services at the end of the fusor-undercloud-installer run.
For TP3 RC1, the nodes registered without problems. I need to try a couple more times just to be on the safe side.
I just had a single node introspection fail, leaving the node in a powered-on state, so I'm inclined to think the "booting before PXE config has completed" theory has some merit.
Introspection worked with this Dell R630 hardware with 7.1, now it only succeeds intermittently. Here's a new error:
| last_error | Failed to change power state to 'power off'. Error: not all arguments converted during string formatting |
Bumping to urgent.
(In reply to Dan Yocum from comment #9)
> Introspection worked with this Dell R630 hardware with 7.1, now it only
> succeeds intermittently. Here's a new error:
> | last_error | Failed to change power state to 'power off'.
> Error: not all arguments |
> | | converted during string formatting
> Bumping to urgent.
This specific error is likely due to ipmitool sending commands to the iDRAC too quickly, causing it to freeze up and requiring a power cycle of the entire system.
A suggestion has been made to edit /etc/ironic/ironic.conf and add this parameter to the [ipmi] section:
This forces a 10-second interval between IPMI commands. The operator should adjust this value as necessary.
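The parameter line itself is not shown in the comment above. As a rough sketch only, assuming the option meant is ironic's `min_command_interval` in the `[ipmi]` section (this name is my assumption and is not confirmed anywhere in this report), the change could be applied to a copy of the configuration like this. The helper name `set_ipmi_interval` and the sample text are hypothetical and for illustration only.

```python
import configparser
import io


def set_ipmi_interval(conf_text: str, seconds: int = 10) -> str:
    """Return conf_text with [ipmi] min_command_interval set to `seconds`.

    ASSUMPTION: `min_command_interval` is the option the comment refers to;
    the original comment does not show the parameter name.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("ipmi"):
        parser.add_section("ipmi")
    parser.set("ipmi", "min_command_interval", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Illustrative sample text, not a real /etc/ironic/ironic.conf.
sample = "[DEFAULT]\ndebug = True\n\n[ipmi]\n"
print(set_ipmi_interval(sample, seconds=10))
```

Note that `configparser` drops comments when it rewrites a file, so on a real system it is probably safer to add the line by hand and then restart the ironic conductor service so the new value is picked up.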
Please re-test with OSP-8
Verified RHOSP Register Node works with QCI 1.2 at least once. Please reopen if this reoccurs.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.