Bug 1472295

Summary: [mix-version] introspect failed from osp12 undercloud with osp 11 image with "Failed to load collector set(['numa-topology'])"
Product: Red Hat OpenStack
Reporter: Raviv Bar-Tal <rbartal>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED NOTABUG
QA Contact: Amit Ugol <augol>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 12.0 (Pike)
CC: aschultz, bfournie, dbecker, mburns, mlammon, morazi, rbartal, rhel-osp-director-maint
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-16 12:31:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description           Flags
sos report            none
ironic.conf           none
ironic-conductor.log  none
ironic-inspector.log  none
screenshot            none

Description Raviv Bar-Tal 2017-07-18 11:55:32 UTC
Created attachment 1300443 [details]
sos report

Description of problem:
Introspection from an osp12 undercloud with osp11 (latest) images fails.

Version-Release number of selected component (if applicable):
undercloud  osp12 
rhosp-director-images-ipa-11.0-20170630.1.el7ost.noarch
rhosp-director-images-11.0-20170630.1.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. install osp12 undercloud
2. upload osp11 images to glance and /httpboot
3. set agent.kernel and agent.ramdisk for the nodes
4. run introspection

Actual results:
introspection fails with a timeout

Expected results:
introspection passes

Additional info:
The hypervisor OS is RHEL 7.4

Comment 1 Dmitry Tantsur 2017-08-02 15:47:17 UTC
Could you please check what exactly happens on the node? Is it e.g. PXE timeout?  https://docs.openstack.org/ironic-inspector/latest/user/troubleshooting.html#introspection-times-out may help

Comment 3 Bob Fournier 2017-08-14 18:45:03 UTC
Ravi - any update on this one?  Thanks.

Comment 4 Raviv Bar-Tal 2017-08-15 12:42:52 UTC
Created attachment 1313650 [details]
ironic.conf

Comment 5 Raviv Bar-Tal 2017-08-15 12:44:04 UTC
Created attachment 1313651 [details]
ironic-conductor.log

Comment 6 Raviv Bar-Tal 2017-08-15 12:44:46 UTC
Created attachment 1313652 [details]
ironic-inspector.log

Comment 7 Raviv Bar-Tal 2017-08-15 12:45:28 UTC
Created attachment 1313653 [details]
screenshot

Comment 8 Raviv Bar-Tal 2017-08-15 12:49:54 UTC
Hi Bob, Dmitry
This BZ is easy to reproduce and it still happens.
The ramdisk logs are not collected on my system; please check the attached ironic.conf and advise on the deploy_logs_* configuration.

also attached:
ironic-conductor.log 
ironic-inspector.log
screenshot from the inspected node

Thanks

Comment 9 Bob Fournier 2017-08-15 18:37:03 UTC
Thanks Ravi.

Looks like this error is coming from IPA:
2017-08-15 07:28:11.837 27198 DEBUG ironic_inspector.main [-] [unidentified node] Received data from the ramdisk: {u'error': u"The following errors were encountered:\n* Failed to load collector set(['numa-topology'])"} api_continue /usr/lib/python2.7/site-packages/ironic_inspector/main.py:193
2017-08-15 07:28:11.837 27198 DEBUG ironic_inspector.process [-] [unidentified node] Running pre-processing hook ramdisk_error _run_pre_hooks /usr/lib/python2.7/site-packages/ironic_inspector/process.py:117
2017-08-15 07:28:11.837 27198 ERROR ironic_inspector.utils [-] [unidentified node] Ramdisk reported error: The following errors were encountered:
* Failed to load collector set(['numa-topology'])
2017-08-15 07:28:11.838 27198 ERROR ironic_inspector.process [-] [unidentified node] Hook ramdisk_error failed, delaying error report until node look up: Ramdisk reported error: The following errors were encountered:

The full stack trace can be seen in the screenshot.
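
When the ramdisk logs themselves are not collected, the error can still be pulled out of ironic-inspector.log. Below is a minimal sketch of doing that, assuming the log format matches the lines quoted above. The function name and the regular expression are hypothetical helpers written for this illustration and are not part of ironic-inspector.

```python
import re

# Assumption: the error is introduced by an ERROR line from ironic_inspector.utils,
# as in the excerpt quoted in this comment.
_ERROR_RE = re.compile(
    r"ERROR ironic_inspector\.utils .*Ramdisk reported error: (.*)$"
)


def find_ramdisk_errors(lines):
    """Collect ramdisk error messages, including '* ...' continuation lines."""
    errors = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = _ERROR_RE.search(line)
        if match:
            # Start a new error record with the text after the marker.
            current = [match.group(1)]
            errors.append(current)
        elif current is not None and line.startswith("* "):
            # Bullet lines directly after the marker belong to the same error.
            current.append(line)
        else:
            current = None
    return ["\n".join(parts) for parts in errors]
```

It could be run over the attached log with something like `find_ramdisk_errors(open("ironic-inspector.log"))`.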

Comment 10 Bob Fournier 2017-08-15 19:12:55 UTC
What appears to be happening is that this instack-undercloud change [1] adds numa-topology as a default collector in OSP-12. That default takes effect in the OSP-12 undercloud, which in this setup installs OSP-11 images. So the default is set in the undercloud, but the installed OSP-11 IPA does not support numa-topology (i.e. it doesn't have [2]), so the collector can't be found.

[1] https://review.openstack.org/#/c/474120/
[2] https://review.openstack.org/#/c/424729/
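
The mismatch described above can be illustrated roughly as follows. This is a simplified sketch and not the actual IPA code: the newer undercloud requests a list of collectors, and the older ramdisk only knows a subset of them. The names `OLD_IMAGE_COLLECTORS`, `parse_collectors`, and `load_collectors` are hypothetical, and the collector names other than numa-topology are illustrative only.

```python
# Assumption: collectors known to the older ramdisk image (illustrative names).
OLD_IMAGE_COLLECTORS = {"default", "logs"}


class InspectionError(Exception):
    """Raised when a requested collector is not available in the image."""


def parse_collectors(option_value):
    """Split a comma-separated collector list into clean names."""
    return [name.strip() for name in option_value.split(",") if name.strip()]


def load_collectors(requested, available):
    """Return the requested collectors in order, or fail if any are missing."""
    missing = [name for name in requested if name not in available]
    if missing:
        raise InspectionError(
            "Failed to load collector %s" % ", ".join(missing)
        )
    return list(requested)


if __name__ == "__main__":
    # The newer undercloud asks for numa-topology; the older image lacks it.
    requested = parse_collectors("default,logs,numa-topology")
    try:
        load_collectors(requested, OLD_IMAGE_COLLECTORS)
    except InspectionError as exc:
        print(exc)
```

The point of the sketch is that the list of collectors is decided on the undercloud side, while the set of available collectors is fixed by whatever IPA version is baked into the image.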

Comment 11 Bob Fournier 2017-08-16 12:31:55 UTC
Closing this, as this isn't a supported configuration. The problem is with running older IPA images with an updated undercloud. Although mixed undercloud/overcloud configurations are supported, the IPA image is really part of the undercloud, and as such, an older IPA image will not work for introspection with a newer undercloud.
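
A pre-flight check for this situation could look something like the sketch below. It assumes the IPA image package follows the naming seen in this report (rhosp-director-images-ipa-11.0-20170630.1.el7ost.noarch) and that the undercloud major version is known to the caller. Both function names are hypothetical and nothing like this is claimed to exist in the director tooling.

```python
import re

# Assumption: the major/minor release follows the package name prefix,
# as in the package listed in this report.
_IPA_PKG_RE = re.compile(r"^rhosp-director-images-ipa-(\d+)\.(\d+)-")


def ipa_image_major(package_name):
    """Extract the major release number from an IPA image package name."""
    match = _IPA_PKG_RE.match(package_name)
    if match is None:
        raise ValueError("unrecognized IPA image package: %r" % package_name)
    return int(match.group(1))


def ipa_image_is_older(package_name, undercloud_major):
    """Return True when the IPA image predates the undercloud release."""
    return ipa_image_major(package_name) < undercloud_major


if __name__ == "__main__":
    pkg = "rhosp-director-images-ipa-11.0-20170630.1.el7ost.noarch"
    if ipa_image_is_older(pkg, 12):
        print("IPA image is older than the undercloud; introspection may fail")
```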