Bug 1937809

Summary: Failed to scale worker using virtualmedia on Dell R640
Product: OpenShift Container Platform
Component: Bare Metal Hardware Provisioning
Sub Component: ironic
Reporter: Bob Fournier <bfournie>
Assignee: Bob Fournier <bfournie>
QA Contact: Amit Ugol <augol>
Status: CLOSED ERRATA
Severity: high
Priority: high
CC: augol, bfournie, derekh, rbartal, sasha, yprokule
Version: 4.7
Keywords: Triaged
Target Release: 4.7.z
Hardware: Unspecified
OS: Unspecified
Clone Of: 1935419
Bug Depends On: 1935419
Last Closed: 2021-04-20 18:52:40 UTC

Description Bob Fournier 2021-03-11 15:16:49 UTC
+++ This bug was initially created as a clone of Bug #1935419 +++

Failed to scale worker using virtualmedia on Dell PowerEdge R640

Version:
[kni@r640-u09 ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0     True        False         15h     Cluster version is 4.7.0



iDRAC Firmware Version: 4.22.00.00


Steps to reproduce:

Try to scale workers using virtualmedia.



Result:
The bare-metal node passes inspection, but provisioning fails with an error like the one in the attached image:
wipefs: error: /dev/sdc1: probing initialization failed: Read-only file system

--- Additional comment from Alexander Chuzhoy on 2021-03-04 21:13:43 UTC ---



--- Additional comment from Alexander Chuzhoy on 2021-03-04 21:16:59 UTC ---



--- Additional comment from Alexander Chuzhoy on 2021-03-04 23:52:13 UTC ---



--- Additional comment from Bob Fournier on 2021-03-05 01:27:05 UTC ---

Sasha - thanks for the ramdisk logs. For the disk that fails cleaning, we see:

5b3d383d-350f-4c71-9ee2-e37ae5af905f_cleaning_2021-03-04-20-02-44.tar.gz: Mar 04 15:03:09 localhost.localdomain kernel: sd 15:0:0:0: [sdc] Write Protect is on
5b3d383d-350f-4c71-9ee2-e37ae5af905f_cleaning_2021-03-04-20-02-44.tar.gz: Mar 04 15:03:09 localhost.localdomain kernel: sd 15:0:0:0: [sdc] Mode Sense: 23 00 80 00

2021-03-04 15:03:40.440 1940 WARNING root [-] Could not determine if /dev/sdc1 is a read-only device. Error: [Errno 2] No such file or directory: '/sys/block/sdc1/ro': FileNotFoundError: [Errno 2] No such file or directory: '/sys/block/sdc1/ro'

2021-03-04 15:03:40.533 1940 DEBUG oslo_concurrency.processutils [-] CMD "wipefs --force --all /dev/sdc1" returned: 1 in 0.033s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:416

2021-03-04 15:03:40.685 1940 ERROR root [-] Failed to erase the metadata on device "/dev/sdc1". Error: Unexpected error while running command.
5b3d383d-350f-4c71-9ee2-e37ae5af905f_cleaning_2021-03-04-20-02-44.tar.gz: Command: wipefs --all /dev/sdc1
5b3d383d-350f-4c71-9ee2-e37ae5af905f_cleaning_2021-03-04-20-02-44.tar.gz: Exit code: 1
5b3d383d-350f-4c71-9ee2-e37ae5af905f_cleaning_2021-03-04-20-02-44.tar.gz: Stdout: ''
5b3d383d-350f-4c71-9ee2-e37ae5af905f_cleaning_2021-03-04-20-02-44.tar.gz: Stderr: 'wipefs: error: /dev/sdc1: probing initialization failed: Read-only file system\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
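
The warning above shows the read-only check looking for /sys/block/sdc1/ro, a path that does not exist because sdc1 is a partition rather than a whole disk. As a rough illustration only (this is not the code used by the ironic ramdisk, and the function name and `sysfs_root` parameter are made up for this sketch), the kernel's `ro` flag can be read for disks and partitions alike through /sys/class/block, which lists both:

```python
from pathlib import Path
from typing import Optional


def is_read_only(device: str, sysfs_root: str = "/sys/class/block") -> Optional[bool]:
    """Return True/False from the kernel's `ro` flag, or None if it cannot be read.

    `device` may be a path such as /dev/sdc1 or a bare kernel name such as sdc1.
    `sysfs_root` is a hypothetical parameter so the lookup can be pointed at a
    test directory instead of the real sysfs tree.
    """
    name = Path(device).name  # "/dev/sdc1" -> "sdc1"
    ro_file = Path(sysfs_root) / name / "ro"
    try:
        # The attribute contains "1" for write-protected devices, "0" otherwise.
        return ro_file.read_text().strip() == "1"
    except OSError:
        # Missing entry or unreadable attribute: the state is unknown.
        return None
```

With a check along these lines, a caller could decide to skip `wipefs` for a device that reports read-only instead of failing the whole cleaning step. Whether skipping is the right policy is a separate question that this sketch does not answer.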

======================

This isn't one of the NVMe drives (see below); it shows as "OEMDRV". Do you know what drive it is and why Write Protect is on?
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sda1" MODEL="" SIZE="402653184" ROTA="1" TYPE="part"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sda2" MODEL="" SIZE="133169152" ROTA="1" TYPE="part"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sda3" MODEL="" SIZE="1048576" ROTA="1" TYPE="part"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sda4" MODEL="" SIZE="3015687680" ROTA="1" TYPE="part"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sdb" MODEL="Virtual Floppy  " SIZE="" ROTA="1" TYPE="disk"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sdc" MODEL="OEMDRV          " SIZE="322961408" ROTA="1" TYPE="disk"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sdc1" MODEL="" SIZE="322960896" ROTA="1" TYPE="part"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="sr0" MODEL="Virtual CD      " SIZE="507875328" ROTA="1" TYPE="rom"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="nvme0n1" MODEL="Dell Express Flash NVMe P4610 1.6TB SFF " SIZE="1600000000000" ROTA="0" TYPE="disk"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="nvme1n1" MODEL="Dell Express Flash NVMe P4610 1.6TB SFF " SIZE="1600000000000" ROTA="0" TYPE="disk"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="nvme2n1" MODEL="Dell Express Flash NVMe P4610 1.6TB SFF " SIZE="1600000000000" ROTA="0" TYPE="disk"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: KNAME="nvme3n1" MODEL="Dell Express Flash NVMe P4610 1.6TB SFF " SIZE="1600000000000" ROTA="0" TYPE="disk"
d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz: " execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103
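
The device listing above is in KEY="value" pairs, the format `lsblk` prints with its `-P` (pairs) option. For anyone triaging a similar log, here is a small sketch for picking out drives that the BMC attaches rather than real storage. The function names and the list of model strings are assumptions made for this example (the strings are taken only from this log), and the sample input has the log-file prefix stripped:

```python
import shlex
from typing import Dict, List, Sequence


def parse_lsblk_pairs(output: str) -> List[Dict[str, str]]:
    """Parse lines of KEY="value" pairs into one dict per device."""
    devices = []
    for line in output.splitlines():
        fields = {}
        # shlex handles the quoting, so values may contain spaces.
        for token in shlex.split(line):
            key, sep, value = token.partition("=")
            if sep:
                fields[key] = value.strip()  # models are space-padded
        if "KNAME" in fields:
            devices.append(fields)
    return devices


def bmc_attached(
    devices: List[Dict[str, str]],
    models: Sequence[str] = ("OEMDRV", "Virtual Floppy", "Virtual CD"),
) -> List[str]:
    """Return kernel names of devices whose model is in `models` (illustrative list)."""
    return [d["KNAME"] for d in devices if d.get("MODEL") in models]


sample = '''KNAME="sdb" MODEL="Virtual Floppy  " SIZE="" ROTA="1" TYPE="disk"
KNAME="sdc" MODEL="OEMDRV          " SIZE="322961408" ROTA="1" TYPE="disk"
KNAME="sdc1" MODEL="" SIZE="322960896" ROTA="1" TYPE="part"
KNAME="nvme0n1" MODEL="Dell Express Flash NVMe P4610 1.6TB SFF " SIZE="1600000000000" ROTA="0" TYPE="disk"'''

print(bmc_attached(parse_lsblk_pairs(sample)))
```

Partitions such as sdc1 report an empty MODEL, so matching on the model alone only flags the parent disk. Callers would need to map partitions back to their parent device themselves.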

--- Additional comment from Derek Higgins on 2021-03-05 09:38:23 UTC ---

(In reply to Bob Fournier from comment #4)
> This isn't one of the NVME drives (see below), it shows as "OEMDRV".  Do you
> know what drive it is and why Write Protect is on?


> d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz:   
> KNAME="sdc" MODEL="OEMDRV          " SIZE="322961408" ROTA="1" TYPE="disk"
> d6a6c8af-4d49-4561-ae87-cafabf752de7_cleaning_2021-03-04-21-01-51.tar.gz:   
> KNAME="sdc1" MODEL="" SIZE="322960896" ROTA="1" TYPE="part"

I haven't seen this before, but looking at info online, it appears to be a
drive that the host attaches in order to re-install the OEM OS.

Apparently it gets removed after 18 hours, or you can
"restart the server and press F10 to enter the Lifecycle Controller
configuration. Then exit the Lifecycle Controller and reboot again"[1]


[1] http://byronwright.blogspot.com/2014/08/remove-oemdrv-drive-from-dell-server.html

--- Additional comment from Bob Fournier on 2021-03-05 12:11:41 UTC ---

Yeah, as Derek found, this drive is unnecessary. According to the Dell documentation, we should also be able to unmount it via the iDRAC 9 GUI: https://www.dell.com/support/kbdoc/en-us/000160908/how-to-mount-and-unmount-the-driver-packs-via-idrac9

Comment 4 errata-xmlrpc 2021-04-20 18:52:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.7 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1149