Bug 1328585 - Introspection errors with 'The following required parameters are missing: ['local_gb']'. Passes on readding nodes and rerunning introspection.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic-python-agent
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: async
Target Release: 8.0 (Liberty)
Assignee: Dmitry Tantsur
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Depends On:
Blocks: 1346023
 
Reported: 2016-04-19 18:53 UTC by Ronelle Landy
Modified: 2020-01-17 15:43 UTC
CC List: 21 users

Fixed In Version: openstack-ironic-python-agent-1.1.0-10.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1346023 (view as bug list)
Environment:
Last Closed: 2016-06-15 12:39:25 UTC
Target Upstream Version:
Embargoed:


Attachments
node missing local_gb (63.02 KB, application/x-gzip)
2016-04-27 09:11 UTC, Shang Wu
log from failed introspection (26.02 KB, application/x-gzip)
2016-04-28 15:52 UTC, Ronelle Landy
Trivial patch for retrying if disk not found (1.32 KB, patch)
2016-05-17 12:20 UTC, David Juran
corrected patch (1.32 KB, patch)
2016-05-18 07:48 UTC, David Juran
Patch also covering deployment (2.46 KB, patch)
2016-05-18 09:36 UTC, David Juran
Screenshots (381.99 KB, application/zip)
2016-06-03 20:39 UTC, Yossi Ovadia
/var/log/ironic-inspector/ramdisk/* (2.21 MB, application/x-gzip)
2016-06-06 17:33 UTC, Yossi Ovadia


Links
Red Hat Product Errata RHBA-2016:1229 (normal, SHIPPED_LIVE): Red Hat OpenStack Platform 8 director Bug Fix Advisory - 2016-06-15 16:38:45 UTC

Description Ronelle Landy 2016-04-19 18:53:15 UTC
Description of problem:

Running bulk introspection returns:
Introspection completed with errors:
cb6cf397-6150-468f-93d6-3066c56118b7: The following required parameters are missing: ['local_gb']

Running 'ironic node-delete' on all the nodes, re-adding the nodes, and rerunning introspection results in introspection passing.

Version-Release number of selected component (if applicable):

[stack@rdo-ci-fx2-05-s7 ~]$ rpm -qa | grep openstack
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
openstack-ceilometer-api-5.0.2-2.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-5.el7ost.noarch
openstack-swift-plugin-swift3-1.9-1.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-9.el7ost.noarch
openstack-ceilometer-notification-5.0.2-2.el7ost.noarch
openstack-neutron-ml2-7.0.4-1.el7ost.noarch
openstack-keystone-8.0.1-1.el7ost.noarch
openstack-nova-cert-12.0.3-1.el7ost.noarch
openstack-ceilometer-collector-5.0.2-2.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch
openstack-swift-2.5.0-2.el7ost.noarch
openstack-ironic-inspector-2.2.5-2.el7ost.noarch
openstack-heat-engine-5.0.1-5.el7ost.noarch
openstack-swift-container-2.5.0-2.el7ost.noarch
openstack-heat-templates-0-0.1.20151019.el7ost.noarch
openstack-tripleo-common-0.3.1-1.el7ost.noarch
openstack-ceilometer-polling-5.0.2-2.el7ost.noarch
openstack-nova-conductor-12.0.3-1.el7ost.noarch
openstack-ironic-common-4.2.2-4.el7ost.noarch
openstack-neutron-openvswitch-7.0.4-1.el7ost.noarch
openstack-heat-common-5.0.1-5.el7ost.noarch
openstack-heat-api-5.0.1-5.el7ost.noarch
openstack-swift-account-2.5.0-2.el7ost.noarch
openstack-swift-proxy-2.5.0-2.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-glance-11.0.1-4.el7ost.noarch
openstack-tripleo-image-elements-0.9.9-2.el7ost.noarch
openstack-ceilometer-alarm-5.0.2-2.el7ost.noarch
openstack-ironic-conductor-4.2.2-4.el7ost.noarch
openstack-nova-common-12.0.3-1.el7ost.noarch
openstack-neutron-common-7.0.4-1.el7ost.noarch
openstack-neutron-7.0.4-1.el7ost.noarch
python-openstackclient-1.7.2-1.el7ost.noarch
openstack-ironic-api-4.2.2-4.el7ost.noarch
openstack-heat-api-cfn-5.0.1-5.el7ost.noarch
openstack-swift-object-2.5.0-2.el7ost.noarch
openstack-selinux-0.6.58-1.el7ost.noarch
openstack-puppet-modules-7.0.17-1.el7ost.noarch
openstack-ceilometer-common-5.0.2-2.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-9.el7ost.noarch
openstack-ceilometer-central-5.0.2-2.el7ost.noarch
openstack-nova-compute-12.0.3-1.el7ost.noarch
openstack-nova-scheduler-12.0.3-1.el7ost.noarch
openstack-nova-api-12.0.3-1.el7ost.noarch


How reproducible:

Fails about 50% of the time

Steps to Reproduce:
1. On a baremetal environment, install OSP-D 8 - latest poodle/puddle
2. Run introspection
3. If introspection fails, delete the nodes, re-add them, and rerun introspection

Actual results:

Introspection completed with errors:
cb6cf397-6150-468f-93d6-3066c56118b7: The following required parameters are missing: ['local_gb']

Expected results:

Introspection should pass - RAID is set up, and a rerun will pass

Additional info:

Comment 2 Guillaume Chenuet 2016-04-25 06:50:08 UTC
Hello,

Same here, random nodes are failing with the same error.

All nodes have a RAID1 configuration.

Comment 3 Dmitry Tantsur 2016-04-26 14:58:29 UTC
Hi all!

We need ramdisk logs to debug this issue. Please set always_store_ramdisk_logs to True in /etc/ironic-inspector/inspector.conf, restart openstack-ironic-inspector service and retry. The log files will be in /var/log/ironic-inspector/ramdisk (these are actually tar.gz files, just without an extension).
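For reference, that option lives in the [processing] section of /etc/ironic-inspector/inspector.conf - a minimal sketch, with the section name taken from the ironic-inspector defaults of that era, so double-check against your own config:

[processing]
always_store_ramdisk_logs = True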

Comment 4 Shang Wu 2016-04-27 09:11:13 UTC
Created attachment 1151251 [details]
node missing local_gb

Ran into this issue as well. Attaching the log.

Comment 5 Shang Wu 2016-04-28 06:28:21 UTC
Looking at the logs, it turns out that on some of the systems I didn't configure the disk for the RAID card. Hence the complaint.

Comment 6 Dmitry Tantsur 2016-04-28 12:59:25 UTC
Right, so if you don't have a visible hard drive, we can't do anything :) Does anyone here have a different source of the problem? Please provide logs.

Comment 7 Ronelle Landy 2016-04-28 14:54:51 UTC
Still trying to secure a log - the error happens randomly ... 

First attempt at introspection - error is on node 90c6089e-c28e-44c1-b13f-c068ef0c937a:

***************

18:57:32 <rdo-ci-fx2-05-s7.v103.rdoci.lab.eng.rdu.redhat.com> REMOTE_MODULE command source /home/stack/stackrc; openstack baremetal introspection bulk start; #USE_SHELL
19:01:06 failed: [undercloud] => {"changed": true, "cmd": "source /home/stack/stackrc; openstack baremetal introspection bulk start;", "delta": "0:03:27.881051", "end": "2016-04-26 15:01:13.769094", "rc": 1, "start": "2016-04-26 14:57:45.888043", "warnings": []}
19:01:06 stderr: Introspection completed with errors:
19:01:06 90c6089e-c28e-44c1-b13f-c068ef0c937a: The following required parameters are missing: ['local_gb']
19:01:06 stdout: Setting nodes for introspection to manageable...
19:01:06 Starting introspection of node: a07709b2-9fe8-476b-8be2-fa402e72911a
19:01:06 Starting introspection of node: b4a670d6-67d2-4825-94ff-95541914fbfc
19:01:06 Starting introspection of node: a97be236-66bc-4f20-bb95-34efceb2364a
19:01:06 Starting introspection of node: 33cb6272-71f1-4ca3-a2f2-3195df324dcb
19:01:06 Starting introspection of node: 50cc77f6-c5ad-44d3-a79c-e7f8995a1bb2
19:01:06 Starting introspection of node: 182b1591-3eac-4692-a131-c73793f49d36
19:01:06 Starting introspection of node: 8d4706e0-6704-4865-b8e3-5dffa19c7150
19:01:06 Starting introspection of node: 90c6089e-c28e-44c1-b13f-c068ef0c937a
19:01:06 Waiting for introspection to finish...
19:01:06 Introspection for UUID a07709b2-9fe8-476b-8be2-fa402e72911a finished successfully.
19:01:06 Introspection for UUID b4a670d6-67d2-4825-94ff-95541914fbfc finished successfully.
19:01:06 Introspection for UUID 50cc77f6-c5ad-44d3-a79c-e7f8995a1bb2 finished successfully.
19:01:06 Introspection for UUID a97be236-66bc-4f20-bb95-34efceb2364a finished successfully.
19:01:06 Introspection for UUID 182b1591-3eac-4692-a131-c73793f49d36 finished successfully.
19:01:06 Introspection for UUID 33cb6272-71f1-4ca3-a2f2-3195df324dcb finished successfully.
19:01:06 Introspection for UUID 90c6089e-c28e-44c1-b13f-c068ef0c937a finished with error: The following required parameters are missing: ['local_gb']
19:01:06 Introspection for UUID 8d4706e0-6704-4865-b8e3-5dffa19c7150 finished successfully.
19:01:06 Setting manageable nodes to available...
19:01:06 Node a07709b2-9fe8-476b-8be2-fa402e72911a has been set to available.
19:01:06 Node b4a670d6-67d2-4825-94ff-95541914fbfc has been set to available.
19:01:06 Node a97be236-66bc-4f20-bb95-34efceb2364a has been set to available.
19:01:06 Node 33cb6272-71f1-4ca3-a2f2-3195df324dcb has been set to available.
19:01:06 Node 50cc77f6-c5ad-44d3-a79c-e7f8995a1bb2 has been set to available.
19:01:06 Node 182b1591-3eac-4692-a131-c73793f49d36 has been set to available.
19:01:06 Node 8d4706e0-6704-4865-b8e3-5dffa19c7150 has been set to available.

***************

Second attempt at introspection, with logging turned on - introspection for UUID 90c6089e-c28e-44c1-b13f-c068ef0c937a finished successfully:

*************

[stack@rdo-ci-fx2-05-s7 ~]$ openstack baremetal introspection bulk start
Setting nodes for introspection to manageable...
Starting introspection of node: a07709b2-9fe8-476b-8be2-fa402e72911a
Starting introspection of node: b4a670d6-67d2-4825-94ff-95541914fbfc
Starting introspection of node: a97be236-66bc-4f20-bb95-34efceb2364a
Starting introspection of node: 33cb6272-71f1-4ca3-a2f2-3195df324dcb
Starting introspection of node: 50cc77f6-c5ad-44d3-a79c-e7f8995a1bb2
Starting introspection of node: 182b1591-3eac-4692-a131-c73793f49d36
Starting introspection of node: 8d4706e0-6704-4865-b8e3-5dffa19c7150
Starting introspection of node: 90c6089e-c28e-44c1-b13f-c068ef0c937a
Waiting for introspection to finish...
Introspection for UUID b4a670d6-67d2-4825-94ff-95541914fbfc finished successfully.
Introspection for UUID a97be236-66bc-4f20-bb95-34efceb2364a finished successfully.
Introspection for UUID 50cc77f6-c5ad-44d3-a79c-e7f8995a1bb2 finished successfully.
Introspection for UUID 33cb6272-71f1-4ca3-a2f2-3195df324dcb finished successfully.
Introspection for UUID 182b1591-3eac-4692-a131-c73793f49d36 finished successfully.
Introspection for UUID 8d4706e0-6704-4865-b8e3-5dffa19c7150 finished successfully.
Introspection for UUID 90c6089e-c28e-44c1-b13f-c068ef0c937a finished successfully.
Introspection didn't finish for nodes a07709b2-9fe8-476b-8be2-fa402e72911a
Setting manageable nodes to available...
Node b4a670d6-67d2-4825-94ff-95541914fbfc has been set to available.
Node a97be236-66bc-4f20-bb95-34efceb2364a has been set to available.
Node 33cb6272-71f1-4ca3-a2f2-3195df324dcb has been set to available.
Node 50cc77f6-c5ad-44d3-a79c-e7f8995a1bb2 has been set to available.
Node 182b1591-3eac-4692-a131-c73793f49d36 has been set to available.
Node 8d4706e0-6704-4865-b8e3-5dffa19c7150 has been set to available.
Node 90c6089e-c28e-44c1-b13f-c068ef0c937a has been set to available.

**************

Comment 8 Ronelle Landy 2016-04-28 15:52:05 UTC
Created attachment 1151948 [details]
log from failed introspection

Comment 10 Irina Petrova 2016-05-04 10:09:46 UTC
(In reply to Dmitry Tantsur from comment #6)
> Right, so if you don't have a visible hard drive, we can't do anything :)
> anyone here has a different source of the problem? please provide logs

Hey Dmitry,

Did you get a chance to take a peek at the new logs? :)

Best,
Irina

Comment 11 Dmitry Tantsur 2016-05-04 10:25:37 UTC
Irina, please provide the logs requested above.

Ronelle, Shang Wu, interesting, in your case lsblk does not return anything:

Apr 28 11:37:51 localhost.localdomain ironic-python-agent[882]: 2016-04-28 11:37:51.403 882 DEBUG oslo_concurrency.processutils [-] CMD "lsblk -Pbdi -oKNAME,MODEL,SIZE,ROTA,TYPE" returned: 0 in 0.007s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
Apr 28 11:37:51 localhost.localdomain ironic-python-agent[882]: 2016-04-28 11:37:51.403 882 DEBUG ironic_python_agent.utils [-] Execution completed, command line is "lsblk -Pbdi -oKNAME,MODEL,SIZE,ROTA,TYPE" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:65
Apr 28 11:37:51 localhost.localdomain ironic-python-agent[882]: 2016-04-28 11:37:51.404 882 DEBUG ironic_python_agent.utils [-] Command stdout is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:66
Apr 28 11:37:51 localhost.localdomain ironic-python-agent[882]: 2016-04-28 11:37:51.404 882 DEBUG ironic_python_agent.utils [-] Command stderr is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:67

The same is reported by the 'hardware' tool: ["disk", "logical", "count", "0"].

Ronelle, is it possible you have the same problem - missing RAID?

Comment 12 Ronelle Landy 2016-05-09 14:29:20 UTC
Dmitry, I think it would be very possible that we were missing RAID if the error were consistent. As in comment 7 above, in some runs introspection completes without any error and in some we get the 'missing: ['local_gb']' error. Is it possible that we get one successful introspection run without RAID?

If I can provide any other info that may be helpful - including access to the test machines, let me know.

Comment 13 Dmitry Tantsur 2016-05-09 14:42:51 UTC
Adding Imre for his DELL expertise.

Comment 14 Imre Farkas 2016-05-10 08:30:20 UTC
(In reply to Ronelle Landy from comment #12)
> Dmitry, I think it would be very possible that we were missing RAID if the
> error were consistent. As in comment 7 above, in some runs introspection
> completes without any error and in some we get the 'missing: ['local_gb']'
> error. Is it possible that we get one successful introspection run without
> RAID?

No, if a missing RAID volume were the cause, it would fail consistently.

Comment 15 Dmitry Tantsur 2016-05-10 08:32:02 UTC
Well, provided that we build RAID consistently... If not, it can cause inconsistent failures. I just don't know of any other reason for lsblk to return nothing...

Comment 16 David Juran 2016-05-10 11:25:18 UTC
I've had some _very_ strange outcomes on a Lenovo (IBM) FlexSystem X240 v1, where the RAID sometimes is there and sometimes is not...

Looking at the ironic-conductor log during deployment (and grepping for the node-id)



2016-05-09 15:04:48.319 1345 ERROR ironic.drivers.modules.agent_base_vendor [-] Asynchronous exception for node ebee9ba8-ad65-44bc-9890-62dc44d8f271: Node failed to get image for deploy. exception: Failed to deploy instance: Failed to start the iSCSI target to deploy the node ebee9ba8-ad65-44bc-9890-62dc44d8f271. Error: {u'message': u'Error finding the disk or partition device to deploy the image onto: No suitable device was found for deployment - root device hints were not provided and none of devices /dev/cciss/c0d0, /dev/sda, /dev/hda, /dev/vda were found', u'code': 404, u'type': u'DeviceNotFound', u'details': u'No suitable device was found for deployment - root device hints were not provided and none of devices /dev/cciss/c0d0, /dev/sda, /dev/hda, /dev/vda were found'}


Do note that this node had introspected successfully at some point...
However, I let the deployment continue (or rather, I didn't spot the problem). And a few minutes later...

2016-05-09 15:20:12.976 1345 DEBUG ironic.drivers.modules.iscsi_deploy [-] Continuing deployment for node ebee9ba8-ad65-44bc-9890-62dc44d8f271, params {'ephemeral_format': None, 'node_uuid': 'ebee9ba8-ad65-44bc-9890-62dc44d8f271', 'image_path': '/var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk', 'address': u'172.25.25.108', 'configdrive': '***', 'port': '3260', 'iqn': 'iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271', 'lun': '1', 'preserve_ephemeral': False, 'boot_option': u'local', 'ephemeral_mb': 0, 'root_mb': 40960, 'swap_mb': 0, 'boot_mode': 'bios'} continue_deploy /usr/lib/python2.7/site-packages/ironic/drivers/modules/iscsi_deploy.py:379
2016-05-09 15:20:12.977 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): env LC_ALL=C LANG=C qemu-img info /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:13.617 1345 DEBUG oslo_concurrency.processutils [-] CMD "env LC_ALL=C LANG=C qemu-img info /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk" returned: 0 in 0.640s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:13.618 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "env LC_ALL=C LANG=C qemu-img info /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:13.618 1345 DEBUG ironic.common.utils [-] Command stdout is: "image: /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk
2016-05-09 15:20:13.683 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -p 172.25.25.108:3260 -T iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271 --login execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:13.751 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -p 172.25.25.108:3260 -T iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271 --login" returned: 0 in 0.068s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:13.752 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "iscsiadm -m node -p 172.25.25.108:3260 -T iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271 --login" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:13.752 1345 DEBUG ironic.common.utils [-] Command stdout is: "Logging in to [iface: default, target: iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271, portal: 172.25.25.108,3260] (multiple)
Login to [iface: default, target: iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271, portal: 172.25.25.108,3260] successful.
2016-05-09 15:20:13.808 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -T iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271 -R execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:13.867 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -T iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271 -R" returned: 0 in 0.059s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:13.867 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "iscsiadm -m node -T iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271 -R" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:13.868 1345 DEBUG ironic.common.utils [-] Command stdout is: "Rescanning session [sid: 24, target: iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271, portal: 172.25.25.108,3260]
2016-05-09 15:20:13.868 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Start destroy disk metadata for node ebee9ba8-ad65-44bc-9890-62dc44d8f271. destroy_disk_metadata /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:482
2016-05-09 15:20:13.869 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/dev/zero of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 bs=512 count=36 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:13.944 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/dev/zero of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 bs=512 count=36" returned: 0 in 0.075s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:13.944 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "dd if=/dev/zero of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 bs=512 count=36" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:13.945 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf blockdev --getsz /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:13.996 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf blockdev --getsz /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1" returned: 0 in 0.050s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:13.996 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "blockdev --getsz /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:13.997 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/dev/zero of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 bs=512 count=36 seek=583983068 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:14.061 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/dev/zero of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 bs=512 count=36 seek=583983068" returned: 0 in 0.064s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:14.062 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "dd if=/dev/zero of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 bs=512 count=36 seek=583983068" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:14.063 1345 INFO ironic.drivers.modules.deploy_utils [-] Disk metadata on /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 successfully destroyed for node ebee9ba8-ad65-44bc-9890-62dc44d8f271
2016-05-09 15:20:14.067 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Starting to partition the disk device: /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 for node ebee9ba8-ad65-44bc-9890-62dc44d8f271 make_partitions /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:280
2016-05-09 15:20:14.067 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Add config drive partition (1 MB) to device: /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 for node ebee9ba8-ad65-44bc-9890-62dc44d8f271 make_partitions /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:310
2016-05-09 15:20:14.067 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Add root partition (40960 MB) to device: /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 for node ebee9ba8-ad65-44bc-9890-62dc44d8f271 make_partitions /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:319
2016-05-09 15:20:14.068 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf parted -a optimal -s /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 -- unit MiB mklabel msdos mkpart primary  1 2 mkpart primary  2 40962 set 2 boot on execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:14.268 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf parted -a optimal -s /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 -- unit MiB mklabel msdos mkpart primary  1 2 mkpart primary  2 40962 set 2 boot on" returned: 0 in 0.200s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:14.269 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "parted -a optimal -s /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 -- unit MiB mklabel msdos mkpart primary  1 2 mkpart primary  2 40962 set 2 boot on" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:14.270 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf fuser /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:14.498 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf fuser /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1" returned: 0 in 0.228s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:14.499 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "fuser /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:15.271 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf fuser /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:15.457 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf fuser /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1" returned: 1 in 0.187s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:15.458 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "fuser /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:15.459 1345 INFO ironic.drivers.modules.deploy_utils [-] Successfully completed the disk device /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1 partitioning for node ebee9ba8-ad65-44bc-9890-62dc44d8f271
2016-05-09 15:20:15.459 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Checking for swap device (None) on node ebee9ba8-ad65-44bc-9890-62dc44d8f271. work_on_disk /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:648
2016-05-09 15:20:15.460 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Checking for ephemeral device (None) on node ebee9ba8-ad65-44bc-9890-62dc44d8f271. work_on_disk /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:648
2016-05-09 15:20:15.460 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Checking for configdrive device (/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part1) on node ebee9ba8-ad65-44bc-9890-62dc44d8f271. work_on_disk /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:648
2016-05-09 15:20:15.460 1345 DEBUG ironic.drivers.modules.deploy_utils [-] Checking for efi system partition device (None) on node ebee9ba8-ad65-44bc-9890-62dc44d8f271. work_on_disk /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:648
2016-05-09 15:20:15.461 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/tmp/configdrivegZDaLe of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part1 bs=1M oflag=direct execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:15.525 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/tmp/configdrivegZDaLe of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part1 bs=1M oflag=direct" returned: 0 in 0.064s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:15.526 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "dd if=/tmp/configdrivegZDaLe of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part1 bs=1M oflag=direct" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:15.527 1345 INFO ironic.drivers.modules.deploy_utils [-] Configdrive for node ebee9ba8-ad65-44bc-9890-62dc44d8f271 successfully copied onto partition /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part1
2016-05-09 15:20:15.527 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): env LC_ALL=C LANG=C qemu-img info /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:20:15.551 1345 DEBUG oslo_concurrency.processutils [-] CMD "env LC_ALL=C LANG=C qemu-img info /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk" returned: 0 in 0.024s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:20:15.552 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "env LC_ALL=C LANG=C qemu-img info /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:20:15.552 1345 DEBUG ironic.common.utils [-] Command stdout is: "image: /var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk
2016-05-09 15:20:15.553 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part2 bs=1M oflag=direct execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:21:13.627 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf dd if=/var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part2 bs=1M oflag=direct" returned: 0 in 58.074s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:21:13.628 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "dd if=/var/lib/ironic/images/ebee9ba8-ad65-44bc-9890-62dc44d8f271/disk of=/dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part2 bs=1M oflag=direct" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88
2016-05-09 15:21:13.629 1345 INFO ironic.drivers.modules.deploy_utils [-] Image for ebee9ba8-ad65-44bc-9890-62dc44d8f271 successfully populated
2016-05-09 15:21:13.629 1345 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf blkid -s UUID -o value /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part2 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
2016-05-09 15:21:13.684 1345 DEBUG oslo_concurrency.processutils [-] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf blkid -s UUID -o value /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part2" returned: 0 in 0.055s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
2016-05-09 15:21:13.685 1345 DEBUG ironic.common.utils [-] Execution completed, command line is "blkid -s UUID -o value /dev/disk/by-path/ip-172.25.25.108:3260-iscsi-iqn.2008-10.org.openstack:ebee9ba8-ad65-44bc-9890-62dc44d8f271-lun-1-part2" execute /usr/lib/python2.7/site-packages/ironic/common/utils.py:88



So suddenly the RAID _was_ there again...

Comment 19 David Juran 2016-05-16 13:59:31 UTC
Running lsblk manually on the introspected node:

[root@localhost ~]# lsblk -Pbdi -oKNAME,MODEL,SIZE,ROTA,TYPE
KNAME="sda" MODEL="Logical Volume  " SIZE="298999349248" ROTA="1" TYPE="disk"

Comment 21 David Juran 2016-05-16 14:20:46 UTC
It seems the problem could be that ironic-python-agent is starting too early, before the disks have been initialized. From the log of a node being introspected:

May 16 08:52:30 localhost.localdomain ironic-python-agent[893]: 2016-05-16 08:52:28.921 893 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): lsblk -Pbdi -oKNAME,MODEL,SIZE,ROTA,TYPE execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
May 16 08:52:30 localhost.localdomain ironic-python-agent[893]: 2016-05-16 08:52:28.925 893 DEBUG oslo_concurrency.processutils [-] CMD "lsblk -Pbdi -oKNAME,MODEL,SIZE,ROTA,TYPE" returned: 0 in 0.004s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
May 16 08:52:30 localhost.localdomain ironic-python-agent[893]: 2016-05-16 08:52:28.925 893 DEBUG ironic_python_agent.utils [-] Execution completed, command line is "lsblk -Pbdi -oKNAME,MODEL,SIZE,ROTA,TYPE" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:65
May 16 08:52:30 localhost.localdomain ironic-python-agent[893]: 2016-05-16 08:52:28.925 893 DEBUG ironic_python_agent.utils [-] Command stdout is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:66
May 16 08:52:30 localhost.localdomain ironic-python-agent[893]: 2016-05-16 08:52:28.926 893 DEBUG ironic_python_agent.utils [-] Command stderr is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:67
May 16 08:52:30 localhost.localdomain ironic-python-agent[893]: 2016-05-16 08:52:28.926 893 DEBUG root [-] TYPE did not match. Wanted: 'disk' but found: '' list_all_block_devices /usr/lib/python2.7/site-packages/ironic_python_agent/hardware.py:100


But only later does the hardware get initialized:

May 16 08:52:34 localhost.localdomain kernel: mpt2sas0: port enable: SUCCESS
May 16 08:52:34 localhost.localdomain kernel: scsi 0:1:0:0: Direct-Access     LSI      Logical Volume   3000 PQ: 0 ANSI: 6
May 16 08:52:34 localhost.localdomain kernel: scsi 0:1:0:0: RAID1: handle(0x00b4), wwid(0x02bfd48f8a5f329f), pd_count(2), type(SSP)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:1:0:0: qdepth(128), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:0:0: Direct-Access     IBM-ESXS MBF2300RC        SB2F PQ: 0 ANSI: 6
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:0:0: SSP: handle(0x0005), sas_addr(0x50000394780944de), phy(0), device_name(0x50000394780944dd)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:0:0: SSP: enclosure_logical_id(0x500507605640ec2c), slot(0)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:0:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:1:0: Direct-Access     IBM-ESXS MBF2300RC        SB2F PQ: 0 ANSI: 6
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:1:0: SSP: handle(0x0006), sas_addr(0x500003947809bda6), phy(1), device_name(0x500003947809bda5)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:1:0: SSP: enclosure_logical_id(0x500507605640ec2c), slot(1)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:1:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
May 16 08:52:34 localhost.localdomain kernel: scsi 0:1:0:0: Attached scsi generic sg0 type 0
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:0:0: Attached scsi generic sg1 type 0
May 16 08:52:34 localhost.localdomain kernel: scsi 0:0:1:0: Attached scsi generic sg2 type 0
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] 583983104 512-byte logical blocks: (298 GB/278 GiB)
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] 4096-byte physical blocks
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] Write Protect is off
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] Mode Sense: 03 00 00 08
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] No Caching mode page found
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] Assuming drive cache: write through
May 16 08:52:34 localhost.localdomain kernel:  sda: sda1 sda2
May 16 08:52:34 localhost.localdomain kernel: sd 0:1:0:0: [sda] Attached SCSI disk


Do note that a similar problem also seems to affect the deploy image, i.e. even if introspection succeeds, the deployment could fail, since it will attempt to start deploying before the disk is ready.

Comment 22 David Juran 2016-05-17 12:20:15 UTC
Created attachment 1158264 [details]
Trivial patch for retrying if disk not found

The attached, rather simplistic, patch will retry the disk inspection if lsblk doesn't return anything.

A limit on the number of retries would probably be a useful addition...
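In outline, the idea is simply the following (a standalone sketch, not the attached patch itself - the real change hooks into ironic_python_agent.hardware, and the function name, retry count, and delay here are illustrative):

import subprocess
import time

def wait_for_block_devices(max_retries=10, delay=3):
    # Poll lsblk until at least one block device shows up; this is
    # the same invocation the agent uses to enumerate disks.
    for _ in range(max_retries):
        out = subprocess.check_output(
            ['lsblk', '-Pbdi', '-oKNAME,MODEL,SIZE,ROTA,TYPE'])
        if out.strip():
            return out
        # Give the RAID controller / SCSI scan time to settle.
        time.sleep(delay)
    raise RuntimeError('no block devices found after %d retries' % max_retries)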

Comment 23 David Juran 2016-05-18 07:48:11 UTC
Created attachment 1158644 [details]
corrected patch

Now with correct indentation

Comment 24 David Juran 2016-05-18 09:36:08 UTC
Created attachment 1158689 [details]
Patch also covering deployment

Turns out there was yet another place which needed patching for deployment to work

Comment 25 Yossi Ovadia 2016-05-18 19:10:18 UTC
How can one test the patch? Must I wait for a ready image, or is there a way to rebuild the image?
Thanks.

Comment 26 David Juran 2016-05-20 13:31:18 UTC
I think the "real" answer spells "openstack overcloud image build"

For the more time-challenged/lazy of us, there is a slightly more hacky procedure:

1) Unpack the image (as root)
mkdir /tmp/foo
cd /tmp/foo
gzip -dc /httpboot/agent.ramdisk | cpio -id 

2) apply the patches

3) Pack it all up again (as root)
cd /tmp/foo
find . |cpio -H newc -o |gzip -c > /httpboot/agent.ramdisk

Similar procedures apply to the deployment disk, but you need to download it off glance, and then after you've modified it, remove (or move away) the existing deployment image, upload your modified one and run "openstack baremetal configure boot". Roughly:
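(an untested sketch - the image name matches a stock undercloud, but double-check the name and formats on your own setup)

openstack image save --file deploy-ramdisk.initramfs bm-deploy-ramdisk
(unpack, patch and repack deploy-ramdisk.initramfs as in steps 1-3 above)
openstack image delete bm-deploy-ramdisk
openstack image create --container-format ari --disk-format ari --file deploy-ramdisk.initramfs bm-deploy-ramdisk
openstack baremetal configure boot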

Also, do note there is a "proper" patch in the works (same idea, but done properly) in https://bugs.launchpad.net/ironic-python-agent/+bug/1582797 so you're probably better off testing that instead of my hack (-:

Comment 27 Yossi Ovadia 2016-05-20 16:29:48 UTC
Nope, will try your hack first :)
Thanks !

Comment 32 Yossi Ovadia 2016-06-03 20:39:58 UTC
Created attachment 1164635 [details]
Screenshots

Comment 33 Yossi Ovadia 2016-06-03 20:42:22 UTC
I am still experiencing the same problem on my hardware, both with the patch suggested by David and with the suggested 'proper' patch.

Attaching some screenshots from the console which show how the hardware is detected during a vanilla RHEL installation.

Comment 34 Dmitry Tantsur 2016-06-06 04:24:07 UTC
Hello!

Please attach your ramdisk logs as explained in comment 3.

Comment 35 Yossi Ovadia 2016-06-06 17:33:40 UTC
Created attachment 1165331 [details]
/var/log/ironic-inspector/ramdisk/*

Logs from /var/log/ironic-inspector/ramdisk using the patch from https://bugzilla.redhat.com/attachment.cgi?id=1158689&action=diff

Introspection hangs on:
[stack@undercloud ~]$ openstack baremetal introspection bulk start
Setting nodes for introspection to manageable...
Starting introspection of node: 2adb62d4-7627-40cc-bc69-f54d4adb80e6
Starting introspection of node: 47d1311e-f78b-456c-90bd-ebef6795cb66
Starting introspection of node: 73ac1e8e-a002-40f0-a265-e7a679eeea9e
Starting introspection of node: 29ca9116-9927-492d-a3ff-7437c0012e68
Waiting for introspection to finish...
Introspection for UUID 29ca9116-9927-492d-a3ff-7437c0012e68 finished with error: The following required parameters are missing: ['local_gb']
Introspection for UUID 47d1311e-f78b-456c-90bd-ebef6795cb66 finished with error: The following required parameters are missing: ['local_gb']

Comment 36 Yossi Ovadia 2016-06-06 19:19:17 UTC
So, since I suspected the be2iscsi module was not loaded, I added the following as a test -

File usr/lib/python2.7/site-packages/ironic_python_agent/hardware.py
(starts at line 52)
def _udev_settle():
    """Wait for the udev event queue to settle.

    Wait for the udev event queue to settle to make sure all devices
    are detected once the machine boots up.

    """
    try:
        utils.execute('udevadm', 'settle')
+        utils.execute('/usr/sbin/modprobe', 'libiscsi','iscsi_boot_sysfs','scsi_transport_iscsi','be2iscsi' )
+        utils.execute('lsmod')


Here you see modprobe being executed, followed by lsmod. lsmod does not show the iscsi modules loaded.

Jun 06 15:07:31 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.195 1004 DEBUG ironic_python_agent.utils [-] Execution completed, command line is "udevadm settle" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:65
Jun 06 15:07:31 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.195 1004 DEBUG ironic_python_agent.utils [-] Command stdout is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:66
Jun 06 15:07:31 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.195 1004 DEBUG ironic_python_agent.utils [-] Command stderr is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:67
Jun 06 15:07:31 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.195 1004 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): /usr/sbin/modprobe libiscsi iscsi_boot_sysfs scsi_transport_iscsi be2iscsi execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:230
Jun 06 15:07:31 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.197 1004 DEBUG oslo_concurrency.processutils [-] CMD "/usr/sbin/modprobe libiscsi iscsi_boot_sysfs scsi_transport_iscsi be2iscsi" returned: 0 in 0.002s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:260
Jun 06 15:07:31 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.198 1004 DEBUG ironic_python_agent.utils [-] Execution completed, command line is "/usr/sbin/modprobe libiscsi iscsi_boot_sysfs scsi_transport_iscsi be2iscsi" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:65
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.198 1004 DEBUG ironic_python_agent.utils [-] Command stdout is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:66
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.198 1004 DEBUG ironic_python_agent.utils [-] Command stderr is: "" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:67
..
..
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.202 1004 DEBUG ironic_python_agent.utils [-] Execution completed, command line is "lsmod" execute /usr/lib/python2.7/site-packages/ironic_python_agent/utils.py:65
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: 2016-06-06 15:07:29.202 1004 DEBUG ironic_python_agent.utils [-] Command stdout is: "Module                  Size  Used by
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: xprtrdma               50777  0
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: sunrpc                300421  1 xprtrdma
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: ib_isert               55331  0
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: iscsi_target_mod      295398  1 ib_isert
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: ib_iser                52048  0
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: ib_srpt                52416  0
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: target_core_mod       371914  3 iscsi_target_mod,ib_srpt,ib_isert
Jun 06 15:07:32 localhost.localdomain ironic-python-agent[1004]: crc_t10dif             12714  1 target_core_mod
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: crct10dif_generic      12647  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_srp                 47430  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: scsi_transport_srp     20725  1 ib_srp
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: scsi_tgt               20027  1 scsi_transport_srp
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_ipoib               91475  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: rdma_ucm               22569  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_ucm                 22546  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_uverbs              47183  2 ib_ucm,rdma_ucm
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_umad                22129  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: rdma_cm                43697  4 xprtrdma,ib_iser,rdma_ucm,ib_isert
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_cm                  46821  5 rdma_cm,ib_srp,ib_ucm,ib_srpt,ib_ipoib
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: iw_cm                  41842  1 rdma_cm
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_sa                  33950  5 rdma_cm,ib_cm,ib_srp,rdma_ucm,ib_ipoib
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_mad                 47817  4 ib_cm,ib_sa,ib_srpt,ib_umad
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ocrdma                 86200  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_core                98787  15 rdma_cm,ib_cm,ib_sa,iw_cm,xprtrdma,ib_mad,ib_srp,ib_ucm,ocrdma,ib_iser,ib_srpt,ib_umad,ib_uverbs,ib_ipoib,ib_isert
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: ib_addr                18923  4 rdma_cm,ocrdma,ib_core,rdma_ucm
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: sr_mod                 22416  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: cdrom                  42556  1 sr_mod
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: usb_storage            66523  0
Jun 06 15:07:33 localhost.localdomain ironic-python-agent[1004]: sg                     40721  0

- iscsi is still not there (why?) and the install fails the same way (Introspection for UUID 1092cb4a-5280-4681-bb74-068eb282b3b1 finished with error: The following required parameters are missing: ['local_gb'] for two servers, and the console loops on 'Disks not ready yet').


As a reference, after installing a vanilla RHEL on the same hardware, lsmod shows:

[root@manager ~]# lsmod | grep scsi
iscsi_target_mod      295398  1 ib_isert
target_core_mod       371914  3 iscsi_target_mod,ib_srpt,ib_isert
scsi_transport_srp     20725  1 ib_srp
scsi_tgt               20027  1 scsi_transport_srp
iscsi_tcp              18333  0
libiscsi_tcp           25146  1 iscsi_tcp
be2iscsi              121223  5
libiscsi               57233  5 libiscsi_tcp,bnx2i,be2iscsi,iscsi_tcp,ib_iser
scsi_transport_iscsi    99909  6 bnx2i,be2iscsi,iscsi_tcp,ib_iser,libiscsi
iscsi_ibft             12940  0
iscsi_boot_sysfs       16000  2 iscsi_ibft,be2iscsi
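
A likely explanation for the modules not showing up, for the record: modprobe treats everything after the first name as parameters for that one module rather than as additional modules to load, so the call above only ever attempts libiscsi. Loading several modules in one call needs the -a flag - an untested sketch of the corrected line:

utils.execute('/usr/sbin/modprobe', '-a', 'libiscsi',
              'iscsi_boot_sysfs', 'scsi_transport_iscsi', 'be2iscsi')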

Comment 38 Dmitry Tantsur 2016-06-07 14:23:21 UTC
Hi! Let's move the discussion to https://bugzilla.redhat.com/show_bug.cgi?id=1283436 please, as this bug is about nodes with actual local disks.

Comment 41 Yossi Ovadia 2016-06-10 21:03:24 UTC
The procedure quoted below works great for agent.ramdisk.

I was unable to do the same for the overcloud images.
My glance list:
bm-deploy-kernel
bm-deploy-ramdisk
overcloud-full (qcow2)
overcloud-full-initrd
overcloud-full-vmlinuz

I tried to glance-save the initrd and extract it, but it does not seem correct.
Which one is in use for overcloud deployment?

When I extract it, will I see the same file hierarchy as I saw with agent.ramdisk?

Thanks,
Yossi. 

(In reply to David Juran from comment #26)
> I think the "real" answer spells "openstack overcloud image build"
> 
> For the more time-challenged/lazy of us, there is a slightly more hacky
> procedure:
> 
> 1) Unpack the image (as root)
> mkdir /tmp/foo
> cd /tmp/foo
> gzip -dc /httpboot/agent.ramdisk | cpio -id 
> 
> 2) apply the patches
> 
> 3) Pack it all up again (as root)
> cd /tmp/foo
> find . |cpio -H newc -o |gzip -c > /httpboot/agent.ramdisk
> 
> Similar procedures apply to the deployment disk, but you need to download it
> off glance, and then after you've modified it, remove (or move away) the
> existing deployment image, upload your modified one and run "openstack
> baremetal configure boot"
> 
> Also, do note there is a "proper" patch in the works (same idea, but done
> properly) in https://bugs.launchpad.net/ironic-python-agent/+bug/1582797 so
> you're probably better off testing that instead of my hack (-:

Comment 43 errata-xmlrpc 2016-06-15 12:39:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1229

