Bug 1256477 - ironic ipmitool intermittently timing out causing API requests to process slowly
Summary: ironic ipmitool intermittently timing out causing API requests to process slowly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: instack-undercloud
Version: 7.0 (Kilo)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: y1
Target Release: 7.0 (Kilo)
Assignee: John Trowbridge
QA Contact: Toure Dunnon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-08-24 16:51 UTC by Jack Waterworth
Modified: 2023-02-22 23:02 UTC

Fixed In Version: instack-undercloud-2.1.2-24.el7ost
Doc Type: Bug Fix
Doc Text:
Nodes registered with an unresponsive IPMI IP address caused the sync power state periodic task to hang for a default 10 minutes. This resulted in unresponsive behavior from Ironic. This fix lowers the default IPMI retry timeout. Now unresponsive nodes report failures faster and do not hang on the sync power state periodic task.
Clone Of:
Environment:
Last Closed: 2015-10-08 12:17:21 UTC
Target Upstream Version:
Embargoed:


Attachments
ironic-conductor and ironic-discoverd logs (405.74 KB, application/x-gzip)
2015-08-24 20:07 UTC, John Trowbridge


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gerrithub.io | 244058 | 0 | None | None | None | Never
Launchpad | 1383432 | 0 | None | None | None | Never
Red Hat Product Errata | RHSA-2015:1862 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise Linux OpenStack Platform 7 director update | 2015-10-08 16:05:50 UTC

Description Jack Waterworth 2015-08-24 16:51:43 UTC
Description of problem:
ironic is very slow to respond and it is deleting nodes then trying to build them again

Steps to Reproduce:
1. Build 4 compute nodes
2. Try to add 4 more nodes

Actual results:
ironic is slow and is rebuilding nodes

---------------------
Aug 21 17:51:54 tpacpuidc ironic-conductor: raise exception.InstanceDeployFailure(msg)
Aug 21 17:51:54 tpacpuidc ironic-conductor: InstanceDeployFailure: Failed to notify ramdisk to reboot after bootloader installation. Error: [Errno 111] ECONNREFUSED
---------------------

Expected results:
Nodes should be added without issue

Comment 4 John Trowbridge 2015-08-24 20:07:15 UTC
Created attachment 1066619
ironic-conductor and ironic-discoverd logs

Comment 6 John Trowbridge 2015-08-25 11:01:07 UTC
I knew this seemed familiar:

https://bugs.launchpad.net/ironic/+bug/1383432

I think we should try to lower that timeout ([ipmi]  #retry_timeout=60). We should try to find the highest value that relieves the issue, as setting this too low can cause some BMCs to crash. I would try 30, 15, 10, 5.

@Jack could you have them try that and report the results?
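
For anyone trying the suggestion above, here is a minimal Python sketch of stepping through the candidate values. It assumes the option is `retry_timeout` in the `[ipmi]` section of the Ironic configuration file, as quoted in the comment. The helper name `set_ipmi_retry_timeout` and the sample text are hypothetical and are not part of Ironic or instack-undercloud. Note that `configparser` drops comment lines when it rewrites a file, so this illustrates the change and is not a safe in-place editor for a real ironic.conf.

```python
import configparser
import io

# Candidate values from the comment above, highest first, so the first
# value that relieves the issue is also the gentlest on the BMCs.
CANDIDATE_TIMEOUTS = [30, 15, 10, 5]


def set_ipmi_retry_timeout(conf_text, seconds):
    """Return conf_text with [ipmi] retry_timeout set to `seconds`.

    Hypothetical helper for illustration only. Comments in conf_text
    are not preserved because configparser discards them on read.
    """
    # interpolation=None keeps literal '%' characters in other options intact;
    # strict=False tolerates duplicate options in hand-edited files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section("ipmi"):
        parser.add_section("ipmi")
    parser.set("ipmi", "retry_timeout", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Sample text mirroring the commented-out default quoted above.
    sample = "[ipmi]\n#retry_timeout=60\n"
    for seconds in CANDIDATE_TIMEOUTS:
        print(set_ipmi_retry_timeout(sample, seconds))
```

After each change the conductor service would presumably need a restart before the new value takes effect, and the value should be kept only if the BMCs stay stable.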

Comment 10 Omri Hochman 2015-09-18 16:06:02 UTC
Verified : instack-undercloud-2.1.2-26.el7ost.noarch

The bug was marked SanityOnly. I checked that no regressions were found when using the proposed fix.

Comment 13 errata-xmlrpc 2015-10-08 12:17:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862

