Bug 1415784
| Summary: | bulk introspection doesn't complete when virtualbmc is used |
|---|---|
| Product: | Red Hat OpenStack |
| Component: | openstack-ironic-inspector |
| Version: | 11.0 (Ocata) |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | high |
| Reporter: | Alexander Chuzhoy <sasha> |
| Assignee: | Dmitry Tantsur <dtantsur> |
| QA Contact: | Alexander Chuzhoy <sasha> |
| CC: | dtantsur, mburns, mlammon, racedoro, rhel-osp-director-maint, sasha, slinaber, srevivo |
| Keywords: | Triaged |
| Target Milestone: | rc |
| Target Release: | 11.0 (Ocata) |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | openstack-ironic-inspector-5.0.1-0.20170214181727.babc2b6.el7ost |
| Last Closed: | 2017-05-17 19:42:11 UTC |
| Type: | Bug |
Description (Alexander Chuzhoy, 2017-01-23 17:30:18 UTC)

Created attachment 1243739 [details]: ironic-inspector logs
Interesting, it looks like introspection actually succeeded (see attached ironic-inspector logs). Will investigate further tomorrow.

Wanted to collect the ramdisk logs, but the ramdisk logs for the very node that is not passing the introspection don't get created. The rest of the ramdisk logs get created just fine (following the procedure at http://tripleo.org/troubleshooting/troubleshooting.html#accessing-logs-from-the-ramdisk).

Ok, it seems like one node gets timed out for some reason. Could you please check the node's console when it's booting? Check that it actually reboots, and see at which stage it fails. Thanks!

Doing more introspections on that setup I got into situations like:

[stack@undercloud ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| c0ed29df-1b05-4d50-a76b-777945b0ec4c | True     | None  |
| 5eb7a8be-9a01-47e9-8a7f-f6839449d4a3 | False    | None  |
| 409cd097-7d2a-4664-bef3-3073729cf597 | True     | None  |
| 5a7fa6e9-5a58-4bd7-bb19-357e421a0e82 | False    | None  |
| b5de4633-e097-4f87-9d2e-fe38a3351127 | False    | None  |
| ab60aeba-a868-40fb-92f2-92313f90b64b | False    | None  |
| b20caa86-51b4-40f8-83a5-53295da56bd7 | True     | None  |
+--------------------------------------+----------+-------+

Where only a subset of nodes would pass introspection. On the console of one of the nodes not passing the introspection I see what's in the attached files (console).

Created attachment 1244445 [details]: console

Just deploying latest ocata with quickstart looks as follows (after introspection):

[stack@undercloud ~]$ ironic node-list
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name      | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| ebf5678f-b69b-4f54-a773-8a3c92793d01 | control-0 | None          | power off   | manageable         | False       |
| d8def4da-1001-4fc9-b6ee-00566e3d6125 | control-1 | None          | power off   | manageable         | False       |
| d898ec08-0a1e-4e3b-a500-8e3af2966138 | control-2 | None          | power off   | manageable         | False       |
| 6f277ba3-3ddf-4edf-9722-e281f2f518bf | compute-0 | None          | power off   | manageable         | False       |
| 21d71ad9-3f60-4d47-8dd3-b00859c71ad9 | compute-1 | None          | power on    | manageable         | False       |
| 79aa0029-a1f2-4ed6-8369-51daa0c7c022 | compute-2 | None          | power off   | manageable         | False       |
| 73822262-f63c-4b04-8909-cc306e5ed127 | ceph-0    | None          | power off   | manageable         | False       |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+

[stack@undercloud ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| ebf5678f-b69b-4f54-a773-8a3c92793d01 | True     | None  |
| d8def4da-1001-4fc9-b6ee-00566e3d6125 | True     | None  |
| d898ec08-0a1e-4e3b-a500-8e3af2966138 | True     | None  |
| 6f277ba3-3ddf-4edf-9722-e281f2f518bf | True     | None  |
| 21d71ad9-3f60-4d47-8dd3-b00859c71ad9 | False    | None  |
| 79aa0029-a1f2-4ed6-8369-51daa0c7c022 | True     | None  |
| 73822262-f63c-4b04-8909-cc306e5ed127 | True     | None  |
+--------------------------------------+----------+-------+

Try setting https://github.com/openstack/ironic-inspector/blob/master/example.conf#L63 to .* and see if it fixes your problem. If not, please paste your nodes.json.

Edited the file per comment #9:

[root@undercloud ~]# grep introspection_delay_drivers /etc/ironic-inspector/inspector.conf
introspection_delay_drivers = ^.*_ssh$

And bounced the openstack-ironic-inspector service:

systemctl restart openstack-ironic-inspector.service

Restarted bulk introspection. Same result.

[stack@undercloud ~]$ cat instackenv.json
{
  "nodes": [
    {
      "name": "control-0",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6230",
      "mac": ["00:50:d0:ad:dd:d9"],
      "cpu": "2",
      "memory": "16384",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "name": "control-1",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6231",
      "mac": ["00:50:d0:ad:dd:dd"],
      "cpu": "2",
      "memory": "16384",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "name": "control-2",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6232",
      "mac": ["00:50:d0:ad:dd:e1"],
      "cpu": "2",
      "memory": "16384",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:control,boot_option:local"
    },
    {
      "name": "compute-0",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6233",
      "mac": ["00:50:d0:ad:dd:e5"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:compute,boot_option:local"
    },
    {
      "name": "compute-1",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6234",
      "mac": ["00:50:d0:ad:dd:e9"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:compute,boot_option:local"
    },
    {
      "name": "compute-2",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6237",
      "mac": ["00:50:d0:ad:dd:ed"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:compute,boot_option:local"
    },
    {
      "name": "ceph-0",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6235",
      "mac": ["00:50:d0:ad:dd:f1"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:ceph,boot_option:local"
    },
    {
      "name": "ceph-1",
      "pm_password": "password",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_addr": "127.0.0.1",
      "pm_port": "6236",
      "mac": ["00:50:d0:ad:dd:f5"],
      "cpu": "2",
      "memory": "8192",
      "disk": "50",
      "arch": "x86_64",
      "capabilities": "profile:ceph,boot_option:local"
    }
  ]
}
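Since every virtualbmc endpoint in the file above shares pm_addr 127.0.0.1 and differs only by pm_port, a duplicated port or MAC silently breaks one node. A quick sanity check of such a file can be sketched as follows; check_vbmc_nodes is a hypothetical helper written for illustration, not part of tripleo or ironic:

```python
import json


def check_vbmc_nodes(env):
    """Sanity-check instackenv.json data for a virtualbmc setup.

    All vbmc endpoints live on one host, so each node needs a distinct
    (pm_addr, pm_port) pair, and every MAC address must be unique.
    Returns the node count on success, raises ValueError otherwise.
    """
    nodes = env["nodes"]
    endpoints = [(n["pm_addr"], n["pm_port"]) for n in nodes]
    macs = [m.lower() for n in nodes for m in n["mac"]]
    if len(set(endpoints)) != len(endpoints):
        raise ValueError("duplicate pm_addr/pm_port pair")
    if len(set(macs)) != len(macs):
        raise ValueError("duplicate MAC address")
    return len(nodes)


# Two of the nodes from the instackenv.json above:
sample = {"nodes": [
    {"pm_addr": "127.0.0.1", "pm_port": "6230", "mac": ["00:50:d0:ad:dd:d9"]},
    {"pm_addr": "127.0.0.1", "pm_port": "6231", "mac": ["00:50:d0:ad:dd:dd"]},
]}
print(check_vbmc_nodes(sample))  # → 2
```

For a real file, `check_vbmc_nodes(json.load(open("instackenv.json")))` would run the same check.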
> introspection_delay_drivers = ^.*_ssh$
Sorry, I asked to set it to .* but it seems like you've left the default value, no?
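For background on why this regex matters: ironic-inspector applies its power-on delay only to nodes whose driver name matches the introspection_delay_drivers pattern. A minimal sketch of that matching logic (the pattern semantics follow Python's re module; the surrounding delay code is not reproduced here):

```python
import re

# Default pattern: only SSH-based drivers get the introspection delay.
default = re.compile(r"^.*_ssh$")
# Suggested fix: delay power-on for every driver.
fixed = re.compile(r".*")

# pxe_ssh matches the default, so those nodes are staggered...
assert default.match("pxe_ssh") is not None
# ...but pxe_ipmitool (what virtualbmc nodes use) does not,
# so all vbmc nodes were powered on at once.
assert default.match("pxe_ipmitool") is None
# With .*, pxe_ipmitool nodes are staggered too.
assert fixed.match("pxe_ipmitool") is not None
```

This is consistent with the outcome below: once the pattern covered pxe_ipmitool, bulk introspection passed.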
Changed the line to read:

introspection_delay_drivers = ^.*$

The introspection passed successfully. Are we going to set it as default? Thanks.

Yeah, I'm on it.

Verified:
Environment:
openstack-ironic-inspector-5.0.1-0.20170214181727.babc2b6.el7ost.noarch
Added 9 nodes to ironic using vbmc and successfully introspected as shown below:
[stack@undercloud-0 ~]$ for i in `openstack baremetal node list -c UUID -f value`; do ironic node-show $i|grep driver_info -A3; done
| driver_info | {u'ipmi_port': u'6231', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6232', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6233', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6237', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6238', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6239', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6234', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6235', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
| driver_info | {u'ipmi_port': u'6236', u'ipmi_username': u'admin', u'deploy_kernel': |
| | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
| | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b', |
| | u'ipmi_password': u'******'} |
[stack@undercloud-0 ~]$ openstack baremetal introspection bulk start
This command is deprecated. Please use "openstack overcloud node introspect" to introspect manageable nodes instead.
Setting nodes for introspection to manageable...
Starting introspection of manageable nodes
Started Mistral Workflow tripleo.baremetal.v1.introspect_manageable_nodes. Execution ID: a9ca7e46-edf6-45c3-a191-b5d8e11f0ba1
Waiting for introspection to finish...
Waiting for messages on queue 'f0c21899-2305-4581-aea2-aa098d4cce90' with no timeout.
Introspection for UUID f01bf51d-52ec-4236-8b2f-0c8cd395a8d6 finished successfully.
Introspection for UUID 5ce84ff7-391e-4bdd-9131-f1e111d88bca finished successfully.
Introspection for UUID 7c1811c1-4dbe-4562-8b0b-629fdac26a6d finished successfully.
Introspection for UUID 6b631edb-a9ed-4cec-8756-1a5c26b6e320 finished successfully.
Introspection for UUID f0e347ca-9636-43b3-950e-df0abbabd708 finished successfully.
Introspection for UUID 267594c5-688f-4d9b-98d3-d7fa9a516083 finished successfully.
Introspection for UUID 2c25db0e-8c50-430b-a922-0d0528e42dba finished successfully.
Introspection for UUID e63c4901-3da4-41d1-9a2a-d24816873bd9 finished successfully.
Introspection for UUID 0c2d6bda-e863-4f39-9f5e-97e84b4df9df finished successfully.
Introspection completed.
Setting manageable nodes to available...
Started Mistral Workflow tripleo.baremetal.v1.provide_manageable_nodes. Execution ID: fd85187d-428a-428c-adaa-87229b166c08
Waiting for messages on queue 'f0c21899-2305-4581-aea2-aa098d4cce90' with no timeout.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245