Bug 2070519

Summary: [OCP 4.8] Ironic inspector image fails to clean disks that are part of a multipath setup if they are passive paths

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mario Abajo <mabajodu> |
| Component: | Bare Metal Hardware Provisioning | Assignee: | Iury Gregory Melo Ferreira <imelofer> |
| Sub component: | ironic | QA Contact: | Amit Ugol <augol> |
| Status: | CLOSED ERRATA | Docs Contact: | Tomas 'Sheldon' Radej <tradej> |
| Severity: | urgent | Priority: | urgent |
| Version: | 4.8 | Keywords: | OtherQA, Triaged |
| Target Release: | 4.8.z | Hardware: | x86_64 |
| OS: | Unspecified | Type: | Bug |
| Clones: | 2076622, 2089309 (view as bug list) | Last Closed: | 2022-06-30 16:35:30 UTC |
| Bug Depends On: | 2089313 | Bug Blocks: | 2076622 |
| CC: | athomas, augol, awolff, ccrum, cgaynor, dhellmann, dmoessne, eglottma, iheim, imelofer, jkreger, jlebon, jsaucier, kurathod, lhh, lmurthy, lshilin, mcornea, nstielau, openshift-bugs-escalate, peasters, pprahlad, rlichti, rpittau, sasha, tsedovic, ykashtan | | |
Description
Mario Abajo
2022-03-31 10:41:26 UTC
For reference, we've encountered a similar issue in UPI-land related to non-optimized paths. In the end, we added full support for installing to multipath devices. Some links:

- https://docs.openshift.com/container-platform/4.8/installing/installing_bare_metal/installing-bare-metal.html#rhcos-enabling-multipath_installing-bare-metal
- https://github.com/coreos/fedora-coreos-config/pull/1011
- https://github.com/openshift/os/blob/master/docs/faq.md#q-does-rhcos-support-multipath

It seems like Ironic needs to learn to do the same thing here: assemble the multipath, and only manipulate the multipathed device itself, not the underlying individual paths. For hooking into RHCOS' support for first-boot multipath, it would also need to add the kernel arguments described in the OpenShift docs above (sketched below).

As a quick and dirty solution, can't we just skip devices that cannot be used? At least as a workaround.

(In reply to Mario Abajo from comment #2)
> As a quick and dirty solution, can't we just skip devices that cannot be
> used? at least as a workaround.

This is technically possible (with a code change to ironic-python-agent), but if we start ignoring errors while cleaning disks, other Ironic users could end up leaking data between customers whenever we fail to clean a disk and ignore it. My $0.02: just load up multipathd and hopefully it will recognize the SAN pathing and de-duplicate it. Unfortunately, SAN controllers often behave differently or need special configuration, which is why we have been shy about incorporating this into the ramdisks by default. I'd personally prefer that we not ignore failed devices, because the risk of data leakage is so high once that starts to happen.

Hello everyone, I've talked with Julia today and we think we need to add a new element to the ramdisk (to be able to identify multipath devices). I've pushed the upstream change already, and we will work on backporting it from 4.11 to 4.8.
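To make the multipathd suggestion and the "identify multipath devices" ramdisk element concrete, here is a minimal shell sketch of what "assemble the multipath and operate on the map, not on the individual paths" looks like. This is only an illustration of the approach, not the actual ironic-python-agent change; the device name is an example:

```
# Load the device-mapper multipath module and write a default config
modprobe dm-multipath
mpathconf --enable

# Start the daemon so it can coalesce the SAN paths into dm maps
multipathd            # or: systemctl start multipathd

# Show the assembled maps; each mpathX aggregates several sdX paths
multipath -ll

# Exit code 0 means the device is a member path of a multipath map,
# so cleaning should target /dev/mapper/mpathX rather than /dev/sda
if multipath -c /dev/sda; then
    echo "member path: skip it and clean the multipath device instead"
fi
```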
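For reference, the RHCOS first-boot multipath hook mentioned in the first comment comes down to two kernel arguments, per the 4.8 docs linked above. A hedged example of passing them at install time; the target device and Ignition URL are placeholders:

```
# Install RHCOS onto the assembled multipath device and enable
# multipath on first boot (device and URL are illustrative)
coreos-installer install /dev/mapper/mpatha \
    --append-karg rd.multipath=default \
    --append-karg root=/dev/disk/by-label/dm-mpath-root \
    --ignition-url http://<bootstrap-host>/worker.ign
```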
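One side note on the procedure below: the CVO override leaves the ConfigMap unmanaged indefinitely. A possible way to inspect the current image references first and, once testing is done, return the ConfigMap to CVO management, assuming no other overrides are in use (illustrative, not part of the original report):

```
# Inspect the image references the ConfigMap currently carries
oc get configmap cluster-baremetal-operator-images -n openshift-machine-api -o yaml

# After testing, drop the overrides so the CVO resumes management
# and reverts the ConfigMap to the release defaults
oc patch clusterversion version --type json \
    -p '[{"op": "remove", "path": "/spec/overrides"}]'
```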
Since a release image with the modified ironic-ipa-downloader image is available (it contains https://review.opendev.org/c/openstack/ironic-python-agent/+/837784), we can try manually updating the ironic-ipa-downloader image after the cluster is deployed.

The procedure I've tested locally, and which worked, is:

1. After your deployment is up, first check that there are no unmanaged resources in your cluster:

   ```
   $ oc get -o json clusterversion version | jq .spec.overrides
   ```

2. After verifying that, move the cluster-baremetal-operator-images ConfigMap to unmanaged by running:

   ```
   $ oc patch clusterversion version --namespace openshift-cluster-version --type merge -p '{"spec":{"overrides":[{"kind":"ConfigMap","group":"v1","name":"cluster-baremetal-operator-images","namespace":"openshift-machine-api","unmanaged":true}]}}'
   clusterversion.config.openshift.io/version patched
   ```

3. Check that the clusterversion now shows the resource as unmanaged:

   ```
   $ oc get -o json clusterversion version | jq .spec.overrides
   [
     {
       "group": "v1",
       "kind": "ConfigMap",
       "name": "cluster-baremetal-operator-images",
       "namespace": "openshift-machine-api",
       "unmanaged": true
     }
   ]
   ```

4. Edit the cluster-baremetal-operator-images ConfigMap and change the value of baremetalIpaDownloader to the new image, in our case quay.io/imelofer/ipa-multipath@sha256:cead7e5a6fe9ad2c5027282a8a74ec12224aafe6b4524fd879f25c4ecc996485:

   ```
   $ oc edit ConfigMap cluster-baremetal-operator-images
   configmap/cluster-baremetal-operator-images edited
   ```

5. Wait about 3 minutes and double-check that the ConfigMap still contains the right config:

   ```
   $ oc describe ConfigMap cluster-baremetal-operator-images | grep "imelofer"
   ```

6. Delete the CBO pod and wait for the CVO to bring it back (after the new CBO starts, you can try to add new nodes to your cluster):

   ```
   $ oc get pods -n openshift-machine-api
   NAME                                           READY   STATUS    RESTARTS   AGE
   cluster-autoscaler-operator-78dbcdbf85-hdp44   2/2     Running   0          78m
   cluster-baremetal-operator-58b9dd5c45-pfhwd    2/2     Running   1          78m
   machine-api-controllers-5bb58fb7bf-lp4fn       7/7     Running   1          73m
   machine-api-operator-658749fccf-rq6c8          2/2     Running   1          78m

   $ oc delete po cluster-baremetal-operator-58b9dd5c45-pfhwd -n openshift-machine-api
   pod "cluster-baremetal-operator-58b9dd5c45-pfhwd" deleted

   $ oc get pods -n openshift-machine-api
   NAME                                           READY   STATUS    RESTARTS   AGE
   cluster-autoscaler-operator-78dbcdbf85-hdp44   2/2     Running   0          79m
   cluster-baremetal-operator-58b9dd5c45-72rsp    2/2     Running   0          25s
   machine-api-controllers-5bb58fb7bf-lp4fn       7/7     Running   1          74m
   machine-api-operator-658749fccf-rq6c8          2/2     Running   1          79m
   ```

*** Bug 2077067 has been marked as a duplicate of this bug. ***

Setting the target release for this BZ to 4.8.z, since we already have the bugs for 4.11, 4.10, and 4.9.

Deployment of 4.8.0-0.nightly-2022-06-15-131405 passed successfully and sanity tests passed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.45 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5167

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days