Bug 1396710

Summary: Introspection getting failed with "ironic_python_agent.ironic_api_client POST failed ('connection aborted', error(111, 'connection refused'))".
Product: Red Hat OpenStack
Reporter: VIKRANT <vaggarwa>
Component: openstack-ironic-python-agent
Assignee: Dmitry Tantsur <dtantsur>
Status: CLOSED NOTABUG
QA Contact: Raviv Bar-Tal <rbartal>
Severity: high
Docs Contact:
Priority: high
Version: 8.0 (Liberty)
CC: bdubois, bfournie, dtantsur, mburns, slinaber, vaggarwa
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-01 15:59:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
Actual failure screenshot none

Description VIKRANT 2016-11-19 04:31:35 UTC
Description of problem:

Introspection fails with "ironic_python_agent.ironic_api_client POST failed ('connection aborted', error(111, 'connection refused'))".

Version-Release number of selected component (if applicable):
RHEL OSP 8
Overcloud node HW : lenovo System x3550 M5

# rpm -qa | grep python | grep ironic 
python-ironicclient-0.8.1-1.el7ost.noarch 
python-ironic-inspector-client-1.2.0-6.el7ost.noarch

# rpm -qa | grep openstack-ironic 
openstack-ironic-common-4.2.5-2.el7ost.noarch 
openstack-ironic-conductor-4.2.5-2.el7ost.noarch 
openstack-ironic-inspector-2.2.6-1.el7ost.noarch 
openstack-ironic-api-4.2.5-2.el7ost.noarch

How reproducible:
Most of the time for the customer.

Steps to Reproduce:
1. Start introspection.
2. Introspection times out after some time.

Actual results:
Introspection fails.

Expected results:
Introspection should complete successfully.

Additional info:

The introspection issue occurs on two nodes; five other nodes with similar HW and FW versions complete introspection successfully.

After many attempts, introspection once completed successfully for the two problematic nodes. However, due to a corrupted OpenStack update, the customer had to redeploy the infrastructure, and the same 2 nodes are again unable to finish introspection, while all the others (6 more nodes, 5 with the same hardware and 1 different) complete it. The 2 nodes keep cycling with the connection refused error. The customer has continued to deploy OpenStack without these 2 nodes, but they will need to be added to the stack.
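
The error(111, 'connection refused') in the message above is the Linux errno ECONNREFUSED: the target host answered, but no process was accepting connections on that port. As a rough way to check this from a machine on the same network, the sketch below attempts a single TCP connection and reports the outcome. The helper name probe_tcp and the address in the usage comment are illustrative assumptions, not details from this report; 6385 is Ironic's default API port.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection; return (reachable, error_name_or_None)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except ConnectionRefusedError:
        # Same condition as error(111, 'connection refused') on Linux:
        # the host is reachable, but nothing is listening on that port.
        return False, errno.errorcode[errno.ECONNREFUSED]
    except OSError as exc:
        # Timeouts, unreachable networks, name resolution failures, etc.
        return False, errno.errorcode.get(exc.errno, type(exc).__name__)


# Usage (hypothetical address; substitute the API host the node should reach):
#   probe_tcp("192.0.2.1", 6385)
```

A refused connection and a timeout point at different problems (service not listening versus traffic being dropped), which is why the sketch reports them separately.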

Comment 14 Dmitry Tantsur 2016-11-29 13:42:44 UTC
Created attachment 1225843 [details]
Actual failure screenshot

I think I've found the actual error message. We need ironic-inspector logs to investigate further.

Comment 15 Dmitry Tantsur 2016-11-29 13:44:22 UTC
Please also provide your instackenv.json (feel free to strip away passwords).
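
One possible way to strip passwords before attaching the file is sketched below. It assumes instackenv.json has the usual layout of a "nodes" list whose entries carry fields such as pm_addr, pm_user and pm_password; the function name redact_secrets, the REDACTED marker and every value in the sample are made up for illustration.

```python
import json


def redact_secrets(value, marker="REDACTED"):
    """Return a copy of a parsed JSON value with password-like fields masked."""
    if isinstance(value, dict):
        return {
            key: marker if "password" in key.lower() else redact_secrets(item, marker)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_secrets(item, marker) for item in value]
    return value


# Made-up sample shaped like a typical instackenv.json; all values are placeholders.
sample = {
    "nodes": [
        {
            "pm_type": "pxe_ipmitool",
            "pm_addr": "192.0.2.10",
            "pm_user": "admin",
            "pm_password": "not-a-real-password",
            "mac": ["52:54:00:00:00:01"],
        }
    ]
}

print(json.dumps(redact_secrets(sample), indent=2))
```

The function matches on key names only, so a secret stored under a key that does not contain "password" would pass through unmasked; the result should still be reviewed before sharing.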

Comment 16 VIKRANT 2016-11-29 13:45:35 UTC
The ironic-inspector logs are present in the file: /cases/01708267/x-text/journal_gxospdir41

Comment 18 Dmitry Tantsur 2016-11-29 13:53:04 UTC
I'd prefer not to have to parse journalctl output to get the required logs.

Anyway, please also provide instackenv.json (stripped of passwords, as requested in comment 15) and the output of ironic node-list.
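
If only a full journal dump is available, the ironic-inspector entries can be pulled out with a small filter. The sketch below assumes the default short journalctl line shape, "<timestamp> <host> <identifier>[<pid>]: <message>", and that the identifier is ironic-inspector; neither has been checked against the attached file, and the function name inspector_lines is hypothetical.

```python
import re

# Assumed line shape: "<timestamp> <host> <identifier>[<pid>]: <message>".
# The identifier "ironic-inspector" is an assumption about the dump.
UNIT_RE = re.compile(r"\sironic-inspector(\[\d+\])?:\s")


def inspector_lines(lines):
    """Yield only the journal lines emitted by ironic-inspector."""
    for line in lines:
        if UNIT_RE.search(line):
            yield line.rstrip("\n")


# Usage (the path is a placeholder for wherever the dump was saved):
#   with open("journal_dump.txt", errors="replace") as handle:
#       for entry in inspector_lines(handle):
#           print(entry)
```

On a live undercloud, asking journalctl for a single unit (for example journalctl -u openstack-ironic-inspector, assuming that is the unit name on the system) avoids the need for this kind of post-processing.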