1466045 – [RFE] Better logic for the default root disk selection

Summary: [RFE] Better logic for the default root disk selection

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-ironic-python-agent
Sub Component:
Version:	13.0 (Queens)
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	---
Assignee:	OSP Team
QA Contact:	mlammon
Docs Contact:
URL:
Whiteboard:	PerfScale
Depends On:
Blocks:

Reported:	2017-06-28 20:59 UTC by Ben England
Modified:	2022-08-24 10:10 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-08-17 19:43:33 UTC
Target Upstream Version:
Embargoed:

Attachments
tarball containing ironic discovery results (58.90 KB, application/x-gzip) 2017-06-28 20:59 UTC, Ben England	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	OSP-2629	0	None	None	None	2022-08-24 10:10:31 UTC

Description Ben England 2017-06-28 20:59:24 UTC

Created attachment 1292701
tarball containing ironic discovery results

Description of problem:

Ironic introspection database reports incorrectly that root disk is /dev/nvme0n1 when it is /dev/sda.

Version-Release number of selected component (if applicable):

python-ironicclient-1.11.1-1.el7ost.noarch
openstack-ironic-api-7.0.1-1.el7ost.noarch
python-ironic-lib-2.5.2-1.el7ost.noarch
openstack-ironic-common-7.0.1-1.el7ost.noarch
python-ironic-inspector-client-1.11.0-1.el7ost.noarch
puppet-ironic-10.4.0-3.el7ost.noarch
openstack-ironic-conductor-7.0.1-1.el7ost.noarch

How reproducible:

not sure yet.

Steps to Reproduce:
1.  introspect cluster including Dell R610 servers (older model)
2.  openstack baremetal node list
3.  for each uuid: openstack baremetal introspection data save <uuid>

Actual results:

The 4 Dell R610 servers with an NVMe SSD incorrectly show the following when I parse the output of the command in step 3. Example:

[stack@gprfc052 introspect_dir]$ jq '.root_disk' c0778747-342a-4abe-9c20-212156615354.params 
{
  "size": 400088457216,
  "serial": "PHFT548200SB400BGN",
  "rotational": false,
  "vendor": null,
  "name": "/dev/nvme0n1",
  "wwn_vendor_extension": null,
  "hctl": null,
  "wwn_with_extension": null,
  "model": "INTEL SSDPEDMD400G4",
  "wwn": null
}

Expected results:

It should show info for /dev/sda as root device.

Additional info:

I'll attach a tarball with the files containing the output of step 3 above.

Comment 1 Red Hat Bugzilla Rules Engine 2017-06-28 20:59:31 UTC

This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 2 Dmitry Tantsur 2017-06-29 09:35:16 UTC

Hi! Why do you think it should use /dev/sda? Have you used root device hints to point at it? Otherwise, ironic-inspector will choose the disk following its internal logic, which does not have to match your expectations.

Comment 3 Ben England 2017-06-30 18:44:57 UTC

Dmitry, sigh, you are correct. Introspection guesses that the smallest disk in the configuration is the root disk; in this case that was the 400 GB /dev/nvme0n1. The real system disk was 1 TB and the OSD data disks were 500 GB. Ouch. Furthermore, if you run

# openstack baremetal introspection data save $uuid | jq '.extra.disk'

you see that it does not include /dev/nvme0n1 in this list, presumably because it thinks it is a system disk.
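
To illustrate the behaviour described above, here is a minimal Python sketch of a "smallest disk wins" heuristic. It is not the actual ironic-python-agent code: the function name guess_root_disk, the /dev/sdb and /dev/sdc device names, and the abbreviated disk list are assumptions for illustration. The sizes follow the configuration described in this comment.

```python
# Hypothetical sketch of a "smallest disk wins" heuristic; not the real IPA code.

def guess_root_disk(disks):
    """Return the disk with the smallest size, or None if the list is empty."""
    return min(disks, key=lambda d: d["size"], default=None)


# Abbreviated inventory modelled on the R610 nodes in this report.
# /dev/sdb and /dev/sdc are assumed names for the 500 GB OSD data disks.
disks = [
    {"name": "/dev/sda", "size": 1000 * 10**9, "rotational": True},
    {"name": "/dev/sdb", "size": 500 * 10**9, "rotational": True},
    {"name": "/dev/sdc", "size": 500 * 10**9, "rotational": True},
    {"name": "/dev/nvme0n1", "size": 400088457216, "rotational": False},
]

print(guess_root_disk(disks)["name"])  # /dev/nvme0n1
```

With this inventory the 400 GB NVMe device is the smallest, so it is picked over the 1 TB system disk.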

Thanks for your clarification. I think I understand why this happened; the question is whether it should happen. How well does this system-disk-guessing heuristic work in practice? If it isn't correct almost all the time, what's the point of guessing the system disk at all? Guessing wrong will result in lots of failed deployments. We need to make this easier to get right the first time.

Another way these heuristics get us into trouble: older Dell R610s had virtual CD-ROM devices and virtual flash devices that, if defined, would show up as the smallest devices. Therefore RHOSP 7 (Kilo) would try to deploy to them. Of course no one needs these now, but older servers may still have them.

Maybe it would be better to leave root_disk uninitialized and force the user to provide it before a deployment can go forward. For example, the user knows what kind of hardware is being deployed on and could provide hints like "choose a system disk that is rotational and is of size 1000 GB"; OOO could then search the introspection DB, print each UUID, and set the root_disk field wherever there was a match. This process only has to be done once, regardless of how many deployments are done on the same nodes, as long as the undercloud node survives.
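
A rough Python sketch of that hint-matching idea, assuming the saved introspection data has already been loaded into a dict of node UUID to disk list. The function names, the tolerance_gb parameter, and the sample disk list are hypothetical; nothing here is an existing TripleO or Ironic API.

```python
# Hypothetical sketch: match user-supplied hints against saved introspection data.

def find_root_disk(disks, rotational, size_gb, tolerance_gb=50):
    """Return disks whose rotational flag matches and whose size is within tolerance."""
    matches = []
    for disk in disks:
        disk_gb = disk["size"] / 10**9
        if disk["rotational"] == rotational and abs(disk_gb - size_gb) <= tolerance_gb:
            matches.append(disk)
    return matches


def assign_root_disks(nodes, rotational, size_gb):
    """Map each node UUID to its single matching disk name, or None if ambiguous."""
    result = {}
    for uuid, disks in nodes.items():
        matches = find_root_disk(disks, rotational, size_gb)
        # Refuse to guess when zero or several disks match the hint.
        result[uuid] = matches[0]["name"] if len(matches) == 1 else None
    return result


# Sample data: the UUID is from this report, the disk list is an assumed abbreviation.
nodes = {
    "c0778747-342a-4abe-9c20-212156615354": [
        {"name": "/dev/sda", "size": 1000 * 10**9, "rotational": True},
        {"name": "/dev/sdb", "size": 500 * 10**9, "rotational": True},
        {"name": "/dev/nvme0n1", "size": 400088457216, "rotational": False},
    ],
}

for uuid, name in assign_root_disks(nodes, rotational=True, size_gb=1000).items():
    print(uuid, name)
```

Returning None for ambiguous nodes keeps the user in the loop instead of silently picking one of several candidates.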

A better guess would be the smallest rotational device, I think, although that would have worked terribly in this config as well. It might work for most production-class hardware configurations, where OSD drives are usually bigger than the internal system disk.

One clue is the number of drives of a given size. In this case, there is only one HDD that is 1 TB, and the rest of the HDDs are all 500 GB. This is a hint that the 500 GB drives are intended for something other than the system disk.
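
Both alternative heuristics, the smallest rotational device and the drive-count clue, can be sketched in Python as follows. This is only an illustration under assumptions: the function names and the sample inventory (including the /dev/sdb and /dev/sdc names) are hypothetical and not part of Ironic.

```python
from collections import Counter

# Hypothetical sketches of two alternative heuristics; not part of Ironic.


def smallest_rotational(disks):
    """Return the smallest rotational disk, or None if there is none."""
    rotational = [d for d in disks if d["rotational"]]
    return min(rotational, key=lambda d: d["size"], default=None)


def unique_size_candidates(disks):
    """Return rotational disks whose size is not shared with any other rotational disk."""
    counts = Counter(d["size"] for d in disks if d["rotational"])
    return [d for d in disks if d["rotational"] and counts[d["size"]] == 1]


# Assumed abbreviation of the configuration described in this comment.
disks = [
    {"name": "/dev/sda", "size": 1000 * 10**9, "rotational": True},
    {"name": "/dev/sdb", "size": 500 * 10**9, "rotational": True},
    {"name": "/dev/sdc", "size": 500 * 10**9, "rotational": True},
    {"name": "/dev/nvme0n1", "size": 400088457216, "rotational": False},
]

print(smallest_rotational(disks)["name"])                 # /dev/sdb
print([d["name"] for d in unique_size_candidates(disks)])  # ['/dev/sda']
```

On this inventory the smallest-rotational rule lands on a 500 GB data disk, while the unique-size clue singles out the 1 TB system disk.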

Opinions?

Comment 5 Bob Fournier 2017-10-18 22:42:33 UTC

We need to better define the request before we can consider this for inclusion in Queens.

Comment 7 Dmitry Tantsur 2017-12-06 12:05:53 UTC

Renaming the RFE to what it really asks for.

To be honest, I think we should just make root device hints required for TripleO. Changing the defaults is breaking, and it will surely start a big flame war, as everyone has their own idea of defaults ;)

We could probably have several named strategies in IPA, and then a kernel command line option to pick one of them. That will allow us to keep the default as it is, while changing it for TripleO specifically.
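
A minimal Python sketch of what such named strategies might look like. Everything here is an assumption for illustration: the option name ipa-root-disk-strategy, the strategy names, and the helper functions do not exist in IPA.

```python
# Hypothetical sketch: named root-disk strategies chosen via the kernel command line.
# The option name "ipa-root-disk-strategy" is invented for illustration.

def smallest(disks):
    """Current-style default: the smallest disk of any kind."""
    return min(disks, key=lambda d: d["size"], default=None)


def smallest_rotational(disks):
    """Alternative: the smallest disk among rotational devices only."""
    return smallest([d for d in disks if d["rotational"]])


STRATEGIES = {"smallest": smallest, "smallest-rotational": smallest_rotational}
DEFAULT_STRATEGY = "smallest"


def parse_cmdline(cmdline):
    """Parse key=value tokens from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


def pick_root_disk(disks, cmdline):
    """Apply the strategy named on the command line, falling back to the default."""
    name = parse_cmdline(cmdline).get("ipa-root-disk-strategy", DEFAULT_STRATEGY)
    strategy = STRATEGIES.get(name, STRATEGIES[DEFAULT_STRATEGY])
    return strategy(disks)


disks = [
    {"name": "/dev/sda", "size": 1000 * 10**9, "rotational": True},
    {"name": "/dev/nvme0n1", "size": 400088457216, "rotational": False},
]

print(pick_root_disk(disks, "quiet")["name"])
print(pick_root_disk(disks, "quiet ipa-root-disk-strategy=smallest-rotational")["name"])
```

Keeping "smallest" as the fallback preserves the existing default for everyone who does not set the option.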

That being said, I don't believe our team will have time to work on it in the near future.

Comment 8 Ben England 2019-01-27 16:32:09 UTC

As for defining the request, it's just common sense that you don't put the operating system on the most expensive, high-performance storage device in the system, preventing it from being used for anything else.  Who would object to that?

Comment 9 Sai Sindhur Malleni 2019-10-09 15:06:48 UTC

Hi Dmitry,

Any update here?

Comment 10 Ben England 2019-10-09 15:28:22 UTC

Sai Malleni thought a mandatory root device hint was an acceptable solution, and I could live with that.

This problem will continue. For example, if your server has both PCI NVM SSD cards and NVDIMM-N pmem modules, Ironic would probably choose the smaller /dev/pmem0 device, which would be a mistake.

Comment 11 Dmitry Tantsur 2019-10-18 10:39:20 UTC

It's an RFE, so there will only be an update when it gets targeted somewhere. Given the current priorities, it's unlikely to happen any time soon.

Ben, we can consider excluding pmem devices explicitly if there is no chance anybody would use them for deployment. Please file a separate bug if you believe that's the case.

Comment 12 pweeks 2021-08-17 19:43:33 UTC

Given a significant reduction in capacity within the team and the age of this RFE, closing as WONTFIX.
Please open a new RFE with updated requirements should there remain a customer need for this feature.
