Bug 1415784 - bulk introspection doesn't complete when virtualbmc is used
Summary: bulk introspection doesn't complete when virtualbmc is used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic-inspector
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: Dmitry Tantsur
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-01-23 17:30 UTC by Alexander Chuzhoy
Modified: 2017-05-17 19:42 UTC (History)
8 users

Fixed In Version: openstack-ironic-inspector-5.0.1-0.20170214181727.babc2b6.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-17 19:42:11 UTC


Attachments (Terms of Use)
ironic-inspector logs (2.12 MB, text/plain)
2017-01-23 17:53 UTC, Dmitry Tantsur
console (254.52 KB, image/png)
2017-01-25 21:07 UTC, Alexander Chuzhoy


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1245 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 23:01:50 UTC
OpenStack gerrit 425641 None None None 2017-01-27 14:46:10 UTC

Description Alexander Chuzhoy 2017-01-23 17:30:18 UTC
openstack-ironic: bulk introspection doesn't complete [virtualbmc]

Environment:
instack-undercloud-6.0.0-0.20170123140647.25ecc19.el7.centos.noarch
puppet-ironic-10.1.0-0.20170123133130.70e2f98.el7.centos.noarch
python-ironic-inspector-client-1.10.0-0.20161219133602.0eae82e.el7.centos.noarch
python2-ironicclient-1.10.0-0.20170120194459.808a4cb.el7.centos.noarch
openstack-ironic-api-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
python-ironic-tests-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
openstack-ironic-inspector-4.2.1-0.20170120195931.59a2009.el7.centos.noarch
openstack-ironic-common-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
openstack-ironic-conductor-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
python-ironic-lib-2.5.1-0.20170120192630.4e16718.el7.centos.noarch



Steps to reproduce:
1. Deploy undercloud
2. Register nodes
3. Attempt introspection:
openstack overcloud node introspect --all-manageable

Result:
All nodes but one complete introspection.
Note: I removed the problematic node from ironic and restarted introspection. Again, all nodes but one (another this time) completed introspection.


Expected result:
The bulk introspection should complete for all nodes.

Comment 2 Dmitry Tantsur 2017-01-23 17:53:35 UTC
Created attachment 1243739 [details]
ironic-inspector logs

Interesting, it looks like introspection actually succeeded (see attached ironic-inspector logs). Will investigate further tomorrow.

Comment 4 Alexander Chuzhoy 2017-01-23 18:01:22 UTC
Wanted to collect the ramdisk logs, but the ramdisk logs for the very node that fails introspection never get created. The ramdisk logs for the rest of the nodes are created just fine (following the procedure at http://tripleo.org/troubleshooting/troubleshooting.html#accessing-logs-from-the-ramdisk).

Comment 5 Dmitry Tantsur 2017-01-24 14:12:48 UTC
Ok, it seems like one node times out for some reason. Could you please check the node's console when it's booting? Check that it actually reboots, and see at which stage it fails. Thanks!

Comment 6 Alexander Chuzhoy 2017-01-25 21:06:48 UTC
Running more introspections on that setup, I ran into situations like this:

[stack@undercloud ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| c0ed29df-1b05-4d50-a76b-777945b0ec4c | True     | None  |
| 5eb7a8be-9a01-47e9-8a7f-f6839449d4a3 | False    | None  |
| 409cd097-7d2a-4664-bef3-3073729cf597 | True     | None  |
| 5a7fa6e9-5a58-4bd7-bb19-357e421a0e82 | False    | None  |
| b5de4633-e097-4f87-9d2e-fe38a3351127 | False    | None  |
| ab60aeba-a868-40fb-92f2-92313f90b64b | False    | None  |
| b20caa86-51b4-40f8-83a5-53295da56bd7 | True     | None  |
+--------------------------------------+----------+-------+


Here only a subset of nodes passes introspection.
On the console of one of the nodes that fails introspection, I see what's shown in the attached file (console).

Comment 7 Alexander Chuzhoy 2017-01-25 21:07:08 UTC
Created attachment 1244445 [details]
console

Comment 8 Alexander Chuzhoy 2017-01-25 22:31:56 UTC
A fresh deployment of the latest Ocata with quickstart looks as follows (after introspection):

[stack@undercloud ~]$ ironic node-list
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name      | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| ebf5678f-b69b-4f54-a773-8a3c92793d01 | control-0 | None          | power off   | manageable         | False       |
| d8def4da-1001-4fc9-b6ee-00566e3d6125 | control-1 | None          | power off   | manageable         | False       |
| d898ec08-0a1e-4e3b-a500-8e3af2966138 | control-2 | None          | power off   | manageable         | False       |
| 6f277ba3-3ddf-4edf-9722-e281f2f518bf | compute-0 | None          | power off   | manageable         | False       |
| 21d71ad9-3f60-4d47-8dd3-b00859c71ad9 | compute-1 | None          | power on    | manageable         | False       |
| 79aa0029-a1f2-4ed6-8369-51daa0c7c022 | compute-2 | None          | power off   | manageable         | False       |
| 73822262-f63c-4b04-8909-cc306e5ed127 | ceph-0    | None          | power off   | manageable         | False       |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+



[stack@undercloud ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| ebf5678f-b69b-4f54-a773-8a3c92793d01 | True     | None  |
| d8def4da-1001-4fc9-b6ee-00566e3d6125 | True     | None  |
| d898ec08-0a1e-4e3b-a500-8e3af2966138 | True     | None  |
| 6f277ba3-3ddf-4edf-9722-e281f2f518bf | True     | None  |
| 21d71ad9-3f60-4d47-8dd3-b00859c71ad9 | False    | None  |
| 79aa0029-a1f2-4ed6-8369-51daa0c7c022 | True     | None  |
| 73822262-f63c-4b04-8909-cc306e5ed127 | True     | None  |
+--------------------------------------+----------+-------+

Comment 9 Dmitry Tantsur 2017-01-26 11:23:57 UTC
Try setting https://github.com/openstack/ironic-inspector/blob/master/example.conf#L63 to .* and see if it fixes your problem.

If not, please paste your nodes.json.
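For clarity, the suggestion above amounts to the following inspector.conf fragment (a sketch; the [DEFAULT] section placement is an assumption, while the option name and the .* value come from this comment):

```ini
[DEFAULT]
# Apply the power-on delay before introspection to every driver,
# not only the SSH-based ones (the shipped default was ^.*_ssh$).
introspection_delay_drivers = .*
```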

Comment 10 Alexander Chuzhoy 2017-01-26 23:45:04 UTC
Edited the file per comment #9:

[root@undercloud ~]# grep introspection_delay_drivers /etc/ironic-inspector/inspector.conf
introspection_delay_drivers = ^.*_ssh$



And bounced the openstack-ironic-inspector service:
systemctl restart openstack-ironic-inspector.service


Restarted bulk introspection.


Same result.

Comment 11 Alexander Chuzhoy 2017-01-26 23:45:59 UTC
[stack@undercloud ~]$ cat instackenv.json 
{
  "nodes": [
    {
      "name": "control-0",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6230",
      "mac": ["00:50:d0:ad:dd:d9"],
      "cpu": "2",
      "memory": "16384",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "name": "control-1",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6231",
      "mac": ["00:50:d0:ad:dd:dd"],
      "cpu": "2",
      "memory": "16384",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "name": "control-2",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6232",
      "mac": ["00:50:d0:ad:dd:e1"],
      "cpu": "2",
      "memory": "16384",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "name": "compute-0",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6233",
      "mac": ["00:50:d0:ad:dd:e5"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:compute,boot_option:local"
    },
    {
      "name": "compute-1",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6234",
      "mac": ["00:50:d0:ad:dd:e9"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:compute,boot_option:local"
    },
    {
      "name": "compute-2",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6237",
      "mac": ["00:50:d0:ad:dd:ed"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:compute,boot_option:local"
    },
    {
      "name": "ceph-0",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6235",
      "mac": ["00:50:d0:ad:dd:f1"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:ceph,boot_option:local"
    },
    {
      "name": "ceph-1",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6236",
      "mac": ["00:50:d0:ad:dd:f5"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:ceph,boot_option:local"
    }
  ]
}

Comment 12 Dmitry Tantsur 2017-01-27 08:48:45 UTC
> introspection_delay_drivers = ^.*_ssh$

Sorry, I asked to set it to .* but it seems like you've left the default value, no?
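The distinction matters because virtualbmc exposes nodes through the pxe_ipmitool driver, which the default pattern does not match, so those nodes get no power-on delay before introspection. A quick illustration (a sketch; the two patterns are the values discussed in this thread, and anchored matching via re.match is an assumption about how the option is evaluated):

```python
import re

DEFAULT = r"^.*_ssh$"   # shipped default: only SSH-based drivers get the delay
PROPOSED = r".*"        # value suggested in comment 9: delay for all drivers

# pxe_ipmitool is what vbmc-backed nodes use; pxe_ssh is the older virt driver
drivers = ["pxe_ssh", "pxe_ipmitool"]

for pattern in (DEFAULT, PROPOSED):
    matched = [d for d in drivers if re.match(pattern, d)]
    print(pattern, "matches", matched)
# The default matches only pxe_ssh; the proposed value matches both drivers.
```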

Comment 13 Alexander Chuzhoy 2017-01-27 14:27:10 UTC
Changed the line to read:

introspection_delay_drivers = ^.*$


The introspection passed successfully.

Are we going to set it as default?
Thanks.

Comment 14 Dmitry Tantsur 2017-01-27 14:46:10 UTC
Yeah, I'm on it.

Comment 16 Alexander Chuzhoy 2017-03-09 19:28:20 UTC
Verified:
Environment:
openstack-ironic-inspector-5.0.1-0.20170214181727.babc2b6.el7ost.noarch

Added 9 nodes to ironic using vbmc and successfully introspected them, as shown below:


[stack@undercloud-0 ~]$ for i in `openstack baremetal node list -c UUID -f value`; do ironic node-show $i|grep driver_info -A3; done
| driver_info            | {u'ipmi_port': u'6231', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6232', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6233', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6237', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6238', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6239', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6234', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6235', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
| driver_info            | {u'ipmi_port': u'6236', u'ipmi_username': u'admin', u'deploy_kernel':    |
|                        | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|                        | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|                        | u'ipmi_password': u'******'}                                             |
[stack@undercloud-0 ~]$ openstack baremetal introspection bulk start
This command is deprecated. Please use "openstack overcloud node introspect" to introspect manageable nodes instead.
Setting nodes for introspection to manageable...
Starting introspection of manageable nodes
Started Mistral Workflow tripleo.baremetal.v1.introspect_manageable_nodes. Execution ID: a9ca7e46-edf6-45c3-a191-b5d8e11f0ba1
Waiting for introspection to finish...
Waiting for messages on queue 'f0c21899-2305-4581-aea2-aa098d4cce90' with no timeout.
Introspection for UUID f01bf51d-52ec-4236-8b2f-0c8cd395a8d6 finished successfully.
Introspection for UUID 5ce84ff7-391e-4bdd-9131-f1e111d88bca finished successfully.
Introspection for UUID 7c1811c1-4dbe-4562-8b0b-629fdac26a6d finished successfully.
Introspection for UUID 6b631edb-a9ed-4cec-8756-1a5c26b6e320 finished successfully.
Introspection for UUID f0e347ca-9636-43b3-950e-df0abbabd708 finished successfully.
Introspection for UUID 267594c5-688f-4d9b-98d3-d7fa9a516083 finished successfully.
Introspection for UUID 2c25db0e-8c50-430b-a922-0d0528e42dba finished successfully.
Introspection for UUID e63c4901-3da4-41d1-9a2a-d24816873bd9 finished successfully.
Introspection for UUID 0c2d6bda-e863-4f39-9f5e-97e84b4df9df finished successfully.
Introspection completed.
Setting manageable nodes to available...
Started Mistral Workflow tripleo.baremetal.v1.provide_manageable_nodes. Execution ID: fd85187d-428a-428c-adaa-87229b166c08
Waiting for messages on queue 'f0c21899-2305-4581-aea2-aa098d4cce90' with no timeout.

Comment 19 errata-xmlrpc 2017-05-17 19:42:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

