Description of problem:
Automated node cleaning is disabled by default. We want this enabled by default to make sure we do not end up with situations like

NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                            8:0    0 446.6G  0 disk
├─sda1                         8:1    0   384M  0 part /boot
├─sda2                         8:2    0   127M  0 part /boot/efi
├─sda3                         8:3    0     1M  0 part
├─sda4                         8:4    0 446.1G  0 part
│ └─coreos-luks-root-nocrypt 253:0    0   446G  0 dm   /sysroot
└─sda5                         8:5    0    65M  0 part
sdb                            8:16   0  14.9G  0 disk
nvme0n1                      259:0    0   1.5T  0 disk
nvme3n1                      259:1    0   1.5T  0 disk
nvme1n1                      259:2    0   1.5T  0 disk
└─ceph--342e23cd--4600--4c23--aaab--5e5f9524aa90-osd--block--24ec8bdd--1f5d--4af7--8dea--0a988f294bfd 253:1 0 1.5T 0 lvm
nvme2n1

on the worker nodes which have been used for other things in the past and still carry disk data/metadata.

[kni@e16-h18-b03-fc640 ansible]$ oc exec -it metal3-568449f7fc-79mkm -c metal3-ironic-conductor cat /etc/ironic/ironic.conf -n openshift-machine-api | grep automated_clean
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
automated_clean = false
#automated_clean = true

Version-Release number of selected component (if applicable):
4.6.4

How reproducible:
100%

Steps to Reproduce:
1. Deploy a 4.6 build
2. oc exec -it metal3-568449f7fc-79mkm -c metal3-ironic-conductor cat /etc/ironic/ironic.conf -n openshift-machine-api | grep automated_clean
3.

Actual results:
automated_clean = false

Expected results:
automated_clean = true

Additional info:
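The grep in the reproduction step returns both the active setting and a commented-out one, which is easy to misread. As a rough sketch only (not part of the product or of Ironic), the check could be scripted so that commented lines are ignored. The function name active_automated_clean is hypothetical, the [conductor] section header in the sample is an assumption about where the option lives, and treating the last uncommented assignment as the effective one is also an assumption.

```python
def active_automated_clean(conf_text):
    """Return the active automated_clean value as a bool, or None if unset.

    Hypothetical helper: it scans for uncommented 'automated_clean = ...'
    lines and ignores section headers, mirroring the grep in the report.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments (e.g. '#automated_clean = true'),
        # and section headers such as '[conductor]'.
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "automated_clean":
            # Assumption: the last uncommented assignment is the effective one.
            value = val.strip().lower() in ("true", "1", "yes", "on")
    return value


# Sample text modelled on the grep output in this report; the section
# header is an assumption and is not shown in the original output.
sample = """\
[conductor]
automated_clean = false
#automated_clean = true
"""

print(active_automated_clean(sample))  # prints False
```

The input text could be the contents of /etc/ironic/ironic.conf as read from the metal3-ironic-conductor container with the oc exec command shown in the report. A result of False matches the "Actual results" on 4.6.4, and True would match the "Expected results".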
Note that cleaning is enabled by default on 4.7, but it wasn't in 4.6: https://github.com/openshift/ironic-image/blob/release-4.6/ironic.conf#L33
Automated cleaning was disabled because of a bug: https://storyboard.openstack.org/#!/story/2007229
4.6 is using IPA python3-ironic-python-agent-6.4.1-0.20201103152810.7306c73.el8.noarch, which contains a fix for that bug: https://review.opendev.org/c/openstack/ironic-python-agent/+/705062
So it should be safe to re-enable automated cleaning. PR here: https://github.com/openshift/ironic-image/pull/128
Increasing priority/severity as we're getting additional reports of this.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.6.16 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0308