Bug 1406856
Summary: | Ironic root_device hint using serial number does not match case | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | chih-hsien.chien
Component: | rhosp-director | Assignee: | Derek Higgins <derekh>
Status: | CLOSED DUPLICATE | QA Contact: | Omri Hochman <ohochman>
Severity: | high | Docs Contact: |
Priority: | medium | |
Version: | 10.0 (Newton) | CC: | aschultz, bfournie, chih-hsien.chien, dbecker, derekh, jdonohue, jmelvin, joea, krzysztofx.malkowski, mburns, mlammon, morazi, racedoro, rhel-osp-director-maint, robert.w.love
Target Milestone: | Upstream M2 | Keywords: | Triaged
Target Release: | 13.0 (Queens) | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-11-27 10:44:02 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1335596, 1409892, 1473267 | |
Attachments: | | |
Description
chih-hsien.chien
2016-12-21 16:30:23 UTC
Created attachment 1234458
Logs information

Kernel version: 3.10.0-514.2.2.el7.x86_64

Created attachment 1234459
Compute node PXE boot screenshot
Error during undercloud deployment:

2017-01-06 10:46:04 - Error: Execution of '/bin/rpm -e firewalld-0.4.3.2-8.el7.noarch' returned 1: error: Failed dependencies:
2017-01-06 10:46:04 - firewalld >= 0.3.5-1 is needed by (installed) anaconda-core-21.48.22.93-1.el7.x86_64
2017-01-06 10:46:04 - firewalld = 0.4.3.2-8.el7 is needed by (installed) firewall-config-0.4.3.2-8.el7.noarch
2017-01-06 10:46:04 - Error: /Stage[main]/Main/Package[firewalld]/ensure: change from 0.4.3.2-8.el7 to absent failed: Execution of '/bin/rpm -e firewalld-0.4.3.2-8.el7.noarch' returned 1: error: Failed dependencies:
2017-01-06 10:46:04 - firewalld >= 0.3.5-1 is needed by (installed) anaconda-core-21.48.22.93-1.el7.x86_64
2017-01-06 10:46:04 - firewalld = 0.4.3.2-8.el7 is needed by (installed) firewall-config-0.4.3.2-8.el7.noarch

----------------------------------------------

Problem solved. I found the root cause. The error was caused by capital letters in the root disk serial number property. In our case, introspection returns the following information about the disk drive:

NODE: 92244c85-2e04-47ef-a7ed-8f153249bcea
[
  {
    "size": 480103981056,
    "rotational": false,
    "vendor": "ATA",
    "name": "/dev/sda",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x55cd2e404c70078b",
    "model": "INTEL SSDSC2BB48",
    "wwn": "0x55cd2e404c70078b",
    "serial": "PHWA60620327480FGN"
  }
]

The "serial" parameter has the value "PHWA60620327480FGN". Unfortunately, passing this value to the Undercloud database with the command

    openstack baremetal node set --property root_device='{"serial": "PHWA60620327480FGN"}' 92244c85-2e04-47ef-a7ed-8f153249bcea

eventually causes an Overcloud deployment error:

    [...] CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500" [...]

To fix the problem, provide the serial number using only lowercase letters:

    openstack baremetal node set --property root_device='{"serial": "phwa60620327480fgn"}' 92244c85-2e04-47ef-a7ed-8f153249bcea

With that change, the Overcloud deployment finishes successfully.

@RedHat engineers - please fix this problem or provide suitable information in the OSP10 deployment guide.

---------------------------------------------------

The root cause can be found in Bugzilla ID 1398288.

I also see this issue. Can we get an update on this Bugzilla, please?

### ironic conductor.log

2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor [req-7a5d33c0-50f7-442f-bea4-8c1995eebea1 - - - - -] Asynchronous exception for node 34caa2a8-3018-415c-9801-09ca3424b0ed: Node failed to get image for deploy. Exception: Failed to deploy instance: Failed to start the iSCSI target to deploy the node 34caa2a8-3018-415c-9801-09ca3424b0ed. Error: {u'message': u"Error finding the disk or partition device to deploy the image onto: No suitable device was found for deployment using these hints {u'serial': u'PK2134P6J905GX'}", u'code': 404, u'type': u'DeviceNotFound', u'details': u"No suitable device was found for deployment using these hints {u'serial': u'PK2134P6J905GX'}"}
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor Traceback (most recent call last):
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/agent_base_vendor.py", line 482, in heartbeat
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor     self.continue_deploy(task)
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor   File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 61, in wrapped
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor     result = f(*args, **kwargs)
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor   File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 138, in wrapper
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor     return f(*args, **kwargs)
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/iscsi_deploy.py", line 381, in continue_deploy
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor     uuid_dict_returned = do_agent_iscsi_deploy(task, self._client)
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor   File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 61, in wrapped
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor     result = f(*args, **kwargs)
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/iscsi_deploy.py", line 308, in do_agent_iscsi_deploy
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor     raise exception.InstanceDeployFailure(reason=msg)
2017-06-28 11:58:40.997 1490 ERROR ironic.drivers.modules.agent_base_vendor InstanceDeployFailure: Failed to deploy instance: Failed to start the iSCSI target to deploy the node 34caa2a8-3018-415c-9801-09ca3424b0ed. Error: {u'message': u"Error finding the disk or partition device to deploy the image onto: No suitable device was found for deployment using these hints {u'serial': u'PK2134P6J905GX'}", u'code': 404, u'type': u'DeviceNotFound', u'details': u"No suitable device was found for deployment using these hints {u'serial': u'PK2134P6J905GX'}"}

[stack@rhel73-osp10-dir swift-data]$ for node in $(ironic node-list | awk '!/UUID/ {print $2}'); do echo "NODE: $node" ; cat inspector_data-$node | jq '.inventory.disks' ; echo "-----" ; done
NODE: 34caa2a8-3018-415c-9801-09ca3424b0ed
[
  {
    "size": 250059350016,
    "rotational": false,
    "vendor": "ATA",
    "name": "/dev/sda",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x50025388a04e0300",
    "model": "Samsung SSD 840",
    "wwn": "0x50025388a04e0300",
    "serial": "S1DBNSAF640513M"
  },
  {
    "size": 2000398934016,
    "rotational": true,
    "vendor": "ATA",
    "name": "/dev/sdb",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000cca22de04702",
    "model": "HGST HUS724020AL",
    "wwn": "0x5000cca22de04702",
    "serial": "PK2134P6J905GX"
  },
  {
    "size": 2000398934016,
    "rotational": true,
    "vendor": "ATA",
    "name": "/dev/sdc",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000cca22de008a5",
    "model": "HGST HUS724020AL",
    "wwn": "0x5000cca22de008a5",
    "serial": "PK2134P6J8GKGX"
  }
]

Changing summary to reflect actual issue.

Fixes for root device matching are planned for OSP-13.

*** Bug 1470405 has been marked as a duplicate of this bug. ***

This was never fixed upstream in rhos-10 but has been fixed from RHOS 11 onwards. A patch is being prepared to fix it in RHOS 10 as part of another bug, so I'll close this as a duplicate.

*** This bug has been marked as a duplicate of bug 1452226 ***
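
The workaround reported in this bug (supplying the serial in lowercase) can be scripted for affected OSP 10 nodes. The following is a minimal sketch, assuming the disk list has the shape shown in the `jq '.inventory.disks'` output in this report. The helper name `build_root_device_hint` and the `DISKS` sample are hypothetical and are not part of Ironic or the openstack CLI; the script only prints the command and does not run it.

```python
import json

# Sample disk inventory, trimmed from the introspection data quoted in
# this bug report (only the fields the sketch needs are kept).
DISKS = [
    {"name": "/dev/sda", "serial": "S1DBNSAF640513M"},
    {"name": "/dev/sdb", "serial": "PK2134P6J905GX"},
    {"name": "/dev/sdc", "serial": "PK2134P6J8GKGX"},
]

NODE_UUID = "34caa2a8-3018-415c-9801-09ca3424b0ed"


def build_root_device_hint(disks, device_name):
    """Return a JSON root_device hint for device_name.

    Hypothetical helper: the serial is lowercased, which is the
    workaround described in this bug, not a documented Ironic rule.
    """
    for disk in disks:
        if disk.get("name") == device_name:
            serial = disk.get("serial")
            if not serial:
                raise ValueError(f"{device_name} reports no serial number")
            return json.dumps({"serial": serial.lower()})
    raise LookupError(f"{device_name} not found in introspection data")


hint = build_root_device_hint(DISKS, "/dev/sdb")

# Print the command for review instead of executing it.
print(
    "openstack baremetal node set "
    f"--property root_device='{hint}' {NODE_UUID}"
)
```

Printing the command rather than invoking it keeps the sketch side-effect free, so the hint can be checked against the introspection data before it is applied. Per the closing comments, releases from RHOS 11 onwards should not need the lowercase conversion.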