Bug 1903649 - Automated cleaning is disabled by default
Summary: Automated cleaning is disabled by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.6.z
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Derek Higgins
QA Contact: Ori Michaeli
URL:
Whiteboard:
Depends On: 1904064
Blocks:
Reported: 2020-12-02 15:04 UTC by Sai Sindhur Malleni
Modified: 2021-02-08 13:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-08 13:50:51 UTC
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ironic-image pull 128 | 0 | None | closed | Bug 1903649: Re-enable automated_clean | 2021-02-10 18:40:46 UTC
Red Hat Product Errata | RHSA-2021:0308 | 0 | None | None | None | 2021-02-08 13:51:05 UTC

Description Sai Sindhur Malleni 2020-12-02 15:04:02 UTC
Description of problem:
Automated node cleaning is disabled by default. We want it enabled by default so that we do not end up with situations like

NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT                                                                                                   
sda                                                                                                     8:0    0 446.6G  0 disk                                                                                                              
├─sda1                                                                                                  8:1    0   384M  0 part /boot                                                                                                        
├─sda2                                                                                                  8:2    0   127M  0 part /boot/efi                                                                                                    
├─sda3                                                                                                  8:3    0     1M  0 part                                                                                                              
├─sda4                                                                                                  8:4    0 446.1G  0 part                                                                                                              
│ └─coreos-luks-root-nocrypt                                                                          253:0    0   446G  0 dm   /sysroot                                                                                                     
└─sda5                                                                                                  8:5    0    65M  0 part                                                                                                              
sdb                                                                                                     8:16   0  14.9G  0 disk                                                                                                              
nvme0n1                                                                                               259:0    0   1.5T  0 disk                                                                                                              
nvme3n1                                                                                               259:1    0   1.5T  0 disk                                                                                                              
nvme1n1                                                                                               259:2    0   1.5T  0 disk                                                                                                              
└─ceph--342e23cd--4600--4c23--aaab--5e5f9524aa90-osd--block--24ec8bdd--1f5d--4af7--8dea--0a988f294bfd 253:1    0   1.5T  0 lvm                                                                                                               
nvme2n1         


on worker nodes that were used for other purposes in the past and still carry disk data/metadata.
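
For anyone wanting to spot such leftovers programmatically, here is a minimal sketch. It assumes the listing is captured with `lsblk --json` rather than the plain-text form shown above, and it treats "a disk that has child devices, none of which are mounted" as a heuristic for stale data. The function names and the sample data are hypothetical and only illustrate the idea.

```python
import json


def _is_mounted(dev):
    """Return True if the device or any descendant has a mount point."""
    # Older util-linux emits "mountpoint"; newer releases emit a "mountpoints" list.
    points = dev.get("mountpoints") or [dev.get("mountpoint")]
    if any(points):
        return True
    return any(_is_mounted(child) for child in dev.get("children", []))


def disks_with_leftovers(lsblk_json):
    """List disks that carry partitions/LVM volumes but have nothing mounted."""
    data = json.loads(lsblk_json)
    return [
        dev["name"]
        for dev in data.get("blockdevices", [])
        if dev.get("type") == "disk" and dev.get("children") and not _is_mounted(dev)
    ]


# Hypothetical, trimmed-down sample modelled on the listing above.
sample = json.dumps({
    "blockdevices": [
        {"name": "sda", "type": "disk", "mountpoint": None, "children": [
            {"name": "sda1", "type": "part", "mountpoint": "/boot"},
        ]},
        {"name": "sdb", "type": "disk", "mountpoint": None},
        {"name": "nvme1n1", "type": "disk", "mountpoint": None, "children": [
            {"name": "ceph-osd-block", "type": "lvm", "mountpoint": None},
        ]},
    ]
})

print(disks_with_leftovers(sample))  # prints ['nvme1n1']
```

The heuristic is deliberately coarse: it would not notice stale filesystem signatures on a disk with no partition table, so it is a quick triage aid rather than a replacement for cleaning.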

[kni@e16-h18-b03-fc640 ansible]$ oc exec -it metal3-568449f7fc-79mkm  -c metal3-ironic-conductor cat /etc/ironic/ironic.conf -n openshift-machine-api | grep automated_clean                                                                 
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.                                                                                                             
automated_clean = false
#automated_clean = true


Version-Release number of selected component (if applicable):
4.6.4

How reproducible:
100%

Steps to Reproduce:
1. Deploy a 4.6 build
2. oc exec metal3-568449f7fc-79mkm -c metal3-ironic-conductor -n openshift-machine-api -- cat /etc/ironic/ironic.conf | grep automated_clean
3.

Actual results:
automated_clean = false

Expected results:
automated_clean = true

Additional info:
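
The grep above also matches the commented-out line, so a scripted check may be easier to read. Below is a minimal sketch that parses the text of ironic.conf and reports the effective value. It assumes the option lives in the `[conductor]` section, as in upstream Ironic, and it assumes a default of true when the option is absent, mirroring the commented-out `#automated_clean = true` line. The function name and the sample text are hypothetical.

```python
import configparser

# Hypothetical excerpt modelled on the grep output in this report.
SAMPLE = """\
[conductor]
automated_clean = false
#automated_clean = true
"""


def automated_clean_enabled(conf_text, default=True):
    """Return the effective [conductor]/automated_clean value from config text."""
    # strict=False tolerates repeated sections/options; interpolation=None
    # keeps literal '%' characters elsewhere in the file from raising errors.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Lines starting with '#' are treated as comments and ignored.
    return parser.getboolean("conductor", "automated_clean", fallback=default)


print(automated_clean_enabled(SAMPLE))  # prints False
```

To use it against a live cluster, the config text would come from the `oc exec ... cat /etc/ironic/ironic.conf` command in the steps to reproduce.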

Comment 1 Steven Hardy 2020-12-02 15:08:09 UTC
Note that cleaning is enabled by default on 4.7, but it wasn't in 4.6:

https://github.com/openshift/ironic-image/blob/release-4.6/ironic.conf#L33

Comment 2 Derek Higgins 2020-12-02 16:07:32 UTC
Automated clean was disabled because of a bug: https://storyboard.openstack.org/#!/story/2007229
4.6 is using IPA python3-ironic-python-agent-6.4.1-0.20201103152810.7306c73.el8.noarch, which contains a fix for the bug: https://review.opendev.org/c/openstack/ironic-python-agent/+/705062
So it should be safe to re-enable automated cleaning. PR here: https://github.com/openshift/ironic-image/pull/128

Comment 3 Derek Higgins 2021-01-18 09:49:54 UTC
Increasing priority/severity as we're getting additional reports of this.

Comment 8 errata-xmlrpc 2021-02-08 13:50:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.6.16 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0308

