Bug 1256477 - ironic ipmitool intermittently timing out causing API requests to process slowly
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: instack-undercloud
Version: 7.0 (Kilo)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: y1
Target Release: 7.0 (Kilo)
Assigned To: John Trowbridge
QA Contact: Toure Dunnon
Keywords: Triaged, Unconfirmed
Reported: 2015-08-24 12:51 EDT by Jack Waterworth
Modified: 2015-10-08 08:17 EDT
Fixed In Version: instack-undercloud-2.1.2-24.el7ost
Doc Type: Bug Fix
Doc Text:
Nodes registered with an unresponsive IPMI IP address caused the sync power state periodic task to hang for a default 10 minutes. This resulted in unresponsive behavior from Ironic. This fix lowers the default IPMI retry timeout. Now unresponsive nodes report failures faster and do not hang on the sync power state periodic task.
Last Closed: 2015-10-08 08:17:21 EDT
Type: Bug
Attachments
ironic-conductor and ironic-discoverd logs (405.74 KB, application/x-gzip)
2015-08-24 16:07 EDT, John Trowbridge


External Trackers
Tracker       ID       Priority  Status  Summary  Last Updated
Launchpad     1383432  None      None    None     Never
Gerrithub.io  244058   None      None    None     Never

Description Jack Waterworth 2015-08-24 12:51:43 EDT
Description of problem:
Ironic is very slow to respond, and it is deleting nodes and then trying to rebuild them.

Steps to Reproduce:
1. build 4 compute nodes
2. try to add 4 more nodes

Actual results:
ironic is slow and is rebuilding nodes

---------------------
Aug 21 17:51:54 tpacpuidc ironic-conductor: raise exception.InstanceDeployFailure(msg)
Aug 21 17:51:54 tpacpuidc ironic-conductor: InstanceDeployFailure: Failed to notify ramdisk to reboot after bootloader installation. Error: [Errno 111] ECONNREFUSED
---------------------

Expected results:
Nodes should be added without issue
Comment 4 John Trowbridge 2015-08-24 16:07:15 EDT
Created attachment 1066619
ironic-conductor and ironic-discoverd logs
Comment 6 John Trowbridge 2015-08-25 07:01:07 EDT
I knew this seemed familiar:

https://bugs.launchpad.net/ironic/+bug/1383432

I think we should try to lower that timeout ([ipmi]  #retry_timeout=60). We should try to find the highest value that relieves the issue, as setting this too low can cause some BMCs to crash. I would try 30, 15, 10, 5.

@Jack could you have them try that and report the results?
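
For anyone trying the values suggested above, the change can be scripted. The following is only a rough sketch, not the fix that shipped in the erratum: it rewrites the `retry_timeout` option in the `[ipmi]` section of the text of an ironic.conf file. The function name `set_ipmi_retry_timeout` is made up for illustration, and the file location (commonly /etc/ironic/ironic.conf) and the need to restart ironic-conductor afterwards are assumptions, not something stated in this bug.

```python
def set_ipmi_retry_timeout(conf_text, seconds):
    """Return conf_text with retry_timeout in [ipmi] set to `seconds`.

    Hypothetical helper for illustration only. It edits the text line by
    line so that comments and all other options are left untouched.
    """
    if seconds <= 0:
        raise ValueError("retry_timeout must be a positive number of seconds")

    new_line = "retry_timeout={}".format(seconds)
    out = []
    in_ipmi = False   # True while we are inside the [ipmi] section
    done = False      # True once the option has been written

    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [ipmi] without having seen the option: add it now.
            if in_ipmi and not done:
                out.append(new_line)
                done = True
            in_ipmi = stripped == "[ipmi]"
        elif in_ipmi and "=" in stripped:
            # Match both the active form and the commented-out default,
            # e.g. "#retry_timeout=60".
            key = stripped.lstrip("#").split("=", 1)[0].strip()
            if key == "retry_timeout":
                if not done:
                    out.append(new_line)
                    done = True
                continue  # drop the old (or duplicate) line
        out.append(line)

    if not done:
        # The option was never written: add it, creating [ipmi] if needed.
        if not in_ipmi:
            out.append("[ipmi]")
        out.append(new_line)

    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "[DEFAULT]\ndebug=True\n[ipmi]\n#retry_timeout=60\n[pxe]\n"
    # Try the values proposed in this comment, highest first.
    for value in (30, 15, 10, 5):
        print(set_ipmi_retry_timeout(sample, value))
```

Starting from the highest candidate value and stepping down only while the slowdown persists follows the caution above about very low values crashing some BMCs.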
Comment 10 Omri Hochman 2015-09-18 12:06:02 EDT
Verified: instack-undercloud-2.1.2-26.el7ost.noarch

The bug was marked SanityOnly. I checked that no regressions were found when using the proposed fix.
Comment 13 errata-xmlrpc 2015-10-08 08:17:21 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862