Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1860923

Summary: [UPGRADE] after restore from 4.3 to 4.4 the FC luns wiped because ignoredisk not working in rhel-8
Product: Red Hat Enterprise Virtualization Manager
Reporter: Kobi Hakimi <khakimi>
Component: Documentation
Assignee: Steve Goodman <sgoodman>
Status: CLOSED CURRENTRELEASE
QA Contact: Eli Marcus <emarcus>
Severity: urgent
Docs Contact:
Priority: urgent
Version: unspecified
CC: aefrat, emarcus, fgarciad, jcall, lsurette, lsvaty, mavital, mhicks, michal.skrivanek, mkalinin, mtessun, nsoffer, pkovar, sgoodman, srevivo
Target Milestone: ovirt-4.4.1
Keywords: AutomationBlocker, Documentation, NoDocsQEReview
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
In RHEL 8.2, Anaconda does not correctly recognize `ignoredisk --drives` in Kickstart files. Consequently, when installing or reinstalling the host’s operating system, it is strongly recommended that you either detach any existing non-OS storage that is attached to the host, or use `ignoredisk --only-use`, to avoid accidental initialization of these disks and the resulting potential data loss.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-08-13 14:34:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1641863, 1862131, 1866243
Bug Blocks:

Description Kobi Hakimi 2020-07-27 13:06:13 UTC
Description of problem:
[UPGRADE] after restore from 4.3 to 4.4 the FC luns wiped because ignoredisk not working in rhel-8

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.1.10-0.1.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Build the environment with the latest build, rhv-4.3.11-6.
2. Back up the engine and save the file to an external mount point.
3. Set the environment to global maintenance.
4. Move the first host to maintenance.
5. Shut down the HE VM.
6. Reprovision the first host with RHEL 8.2.
This step is important because the Kickstart file contains the following lines:
...
%include /tmp/ignoredisk.ks
%pre
echo "# Ignore FC drives if present" > /tmp/ignoredisk.ks
fc=`ls -l /dev/disk/by-path/pci-*-fc-* | wc -l`
if [ $fc -gt 0 ]
then
fc_ignore=`ls -1 /dev/disk/by-path/pci-*-fc-* | xargs -I {} realpath {} | tr '\n' ','`
echo "ignoredisk --drives=$fc_ignore" > /tmp/ignoredisk.ks
...
zerombr
clearpart --all --initlabel
...
7. Add the RHEL and RHV repositories to this host.
8. Install the RHV packages ovirt-hosted-engine-setup and rhvm-appliance.
9. Get the backup file from the external mount point.
10. Deploy the hosted engine with the restore-from-file option.
11. Remove the additional hosts one by one, reprovision each with RHEL 8.2, and add it back to RHV:
   - Move the host to maintenance and remove it from the engine.
   - Provision the host with RHEL 8.2.
   - Install the RHEL and RHV repositories.
   - Add the host back to the engine as a new host.
   - Wait until the host moves to active status with a valid HA score.
12. Remove global maintenance from the environment.
13. Upgrade everything else (cluster version, data center version).


Actual results:
The upgrade succeeded, but the FC LUNs are in non-operational status.
This is a result of "clearpart --all", which erased all the partitions on the FC LUNs because they were not ignored during provisioning.

Expected results:
ignoredisk works as expected in RHEL 8, and all SDs are up and running as they were before the upgrade.


Additional info:
related to bug: 
https://bugzilla.redhat.com/show_bug.cgi?id=1641863

Comment 7 Nir Soffer 2020-07-27 13:23:32 UTC
(In reply to Kobi Hakimi from comment #0)
...
> %include /tmp/ignoredisk.ks
> %pre
> echo "# Ignore FC drives if present" > /tmp/ignoredisk.ks
> fc=`ls -l /dev/disk/by-path/pci-*-fc-* | wc -l`
> if [ $fc -gt 0 ]
> then
> fc_ignore=`ls -1 /dev/disk/by-path/pci-*-fc-* | xargs -I {} realpath {} | tr
> '\n' ','`
> echo "ignoredisk --drives=$fc_ignore" > /tmp/ignoredisk.ks

Do we have multipath in the installer? If we do, we may also need to ignore
the multipath devices (e.g. /dev/mapper).

Can we replace the fragile ignore list with a simple and more robust include
list, so that we specify only the disks that should be used by the installer?

> ...
> zerombr
> clearpart --all --initlabel

Is this something that we do or part of the installer?
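
A minimal sketch of the include-list idea raised above, for a Kickstart %pre script. It is not taken from this bug: the function names are hypothetical, and the "-fc-" name match is only the same heuristic as the reporter's `pci-*-fc-*` glob. It does not detect iSCSI, multipath, or other shared storage, so naming the OS disk explicitly is likely safer wherever the disk layout is known in advance.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel names (for example "sda") of whole
# disks whose by-path link name does not contain "-fc-".
# Assumption: FC LUNs are recognisable by "-fc-" in the by-path link name.
list_non_fc_disks() {
    by_path_dir=${1:-/dev/disk/by-path}
    for link in "$by_path_dir"/*; do
        [ -L "$link" ] || continue              # also skips an unmatched glob
        case "$(basename "$link")" in
            *-fc-*|*-part[0-9]*) continue ;;    # skip FC paths and partitions
        esac
        basename "$(readlink "$link")"          # "../../sda" becomes "sda"
    done | sort -u
}

# Hypothetical helper: build the Kickstart line, or print nothing when no
# candidate disk was found (so that an empty include list is never written).
build_only_use_line() {
    disks=$(list_non_fc_disks "$1" | paste -sd, -)
    if [ -n "$disks" ]; then
        echo "ignoredisk --only-use=$disks"
    fi
}

# Possible use inside %pre (not executed here):
#   build_only_use_line > /tmp/ignoredisk.ks
```

Because the list is joined with `paste`, the generated line has no trailing comma, unlike the `tr '\n' ','` pipeline in the original snippet.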

Comment 18 Michal Skrivanek 2020-07-28 10:14:45 UTC
Changing to documentation. This is out of scope for RHV, but we should add a warning to the upgrade documentation about being extra careful not to wipe out SD disks.

Comment 21 Martin Tessun 2020-07-28 10:18:57 UTC
Please add a warning to the Upgrade and Installation Guides:

Warning: Please be aware that you might have data disks attached to the host that you could destroy if they are not carefully excluded. Red Hat strongly recommends detaching non-OS storage during reinstallation to avoid any chance of data loss during (re)installation of a host.

Comment 26 Steve Goodman 2020-07-29 08:47:19 UTC
Here's the warning I'm adding.

[WARNING]
====
If you are reinstalling a host, {org-fullname} strongly recommends that you first detach any existing non-OS storage that is attached to the host to avoid any chance of data loss during reinstallation.
====

INSTALL GUIDES

I'll add this to the following places (in all 4 install guides):

Preparing Storage [1]

Installing Red Hat Virtualization Hosts [2]

Installing Red Hat Enterprise Linux hosts [3]

UPGRADE GUIDE

Any topics that have to do with upgrading hosts. There's a merge request for the upgrade guide as part of bug 1802650; I'll update it with the new warning.

---------------------------------------------------

[1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4-beta/html-single/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/index#Preparing_Storage_for_RHV_SHE_cockpit_deploy

[2] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4-beta/html-single/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/index#Installing_Red_Hat_Virtualization_Hosts_SHE_deployment_host

[3] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4-beta/html-single/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/index#Installing_Red_Hat_Enterprise_Linux_Hosts_SHE_deployment_host

Comment 28 Steve Goodman 2020-07-29 09:48:46 UTC
Does this warning only apply to 4.4?

Comment 29 Steve Goodman 2020-07-29 09:52:48 UTC
I added the warning to the upgrade guide in the following merge request:

https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1639

Please review

Comment 30 Martin Tessun 2020-07-29 11:16:12 UTC
Sounds good to me.

Comment 32 Kobi Hakimi 2020-07-29 12:14:07 UTC
(In reply to Nir Soffer from comment #7)
> (In reply to Kobi Hakimi from comment #0)
> ...
> > %include /tmp/ignoredisk.ks
> > %pre
> > echo "# Ignore FC drives if present" > /tmp/ignoredisk.ks
> > fc=`ls -l /dev/disk/by-path/pci-*-fc-* | wc -l`
> > if [ $fc -gt 0 ]
> > then
> > fc_ignore=`ls -1 /dev/disk/by-path/pci-*-fc-* | xargs -I {} realpath {} | tr
> > '\n' ','`
> > echo "ignoredisk --drives=$fc_ignore" > /tmp/ignoredisk.ks
> 
> Do we have multipath in the installer? if we do we may need to ignore also 
> the multipath devices (e.g. /dev/mapper).

I tried other options to exclude the FC disks, as you can see in detail in:
https://bugzilla.redhat.com/show_bug.cgi?id=1641863#c23
https://bugzilla.redhat.com/show_bug.cgi?id=1641863#c32

> 
> Can we replace the fragile ignore list with simple and more robust include
> list? So we specify only the disks that should be used by the installer?

Yep, I need to try --only-use to include only the desired disks.

> 
> > ...
> > zerombr
> > clearpart --all --initlabel
> 
> Is this something that we do or part of the installer?

It's only part of the ks file.

Comment 34 Lukas Svaty 2020-07-29 12:51:59 UTC
Steve, if there is anything we can do to make the warning REALLY visible, so that readers do not skip it while reading through the upgrade guide, I would appreciate it. (I am not sure about the current state; I did not find a link to the docs.)

Comment 35 Steve Goodman 2020-07-29 13:48:23 UTC
(In reply to Lukas Svaty from comment #34)
> Steve if there is anything we can do to make the warning REALLY visible and
> not to skip while reading through upgrade guide, I would appreciate it. (Not
> sure about current state, did not find link to docs)

Lukas, please take a look at the previews in the merge requests to see where I added the Warning. Let me know what you think.

The warning text is:

====
When installing or reinstalling the host’s operating system, Red Hat strongly recommends that you first detach any existing non-OS storage that is attached to the host to avoid accidental initialization of these disks, and with that, potential data loss. 
====

and for the two Kickstart examples in the Install Guides:

====
This example assumes that all disks are empty and can be initialized. If you have attached disks with data, either remove them or add them to the `ignoredisk` command.
====

Install Guides:
https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1726

Upgrade Guide:
https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1639

Comment 36 Steve Goodman 2020-07-29 14:18:20 UTC
The doc changes do not affect any procedures, so I'm adding the NoDocsQEReview keyword.

Comment 43 Nir Soffer 2020-07-30 14:15:15 UTC
Detaching LUNs which are not needed by the installer is the only safe way
to avoid wiping shared storage. This is not an RHV upgrade or Linux issue;
customers have wiped RHV LUNs from the Windows installer.

For Linux there is another solution that may be more practical: blacklist
the FC driver kernel module during installation. Maybe we can recommend a list
of known kernel modules that should be blacklisted during installation?

This will not help if the host is booting from SAN; in this case the only
way is to tell the installer which disk should be used.

Comment 45 Kobi Hakimi 2020-07-30 15:26:06 UTC
(In reply to Nir Soffer from comment #43)
> Detaching LUNs which are not needed by the installed is the only safe way
> to avoid wiping shared storage. This is not RHV upgrade or Linux issue, 
> customers have wiped RHV LUNs from Windows installer.
> 
> For Linux there is a another solution that may be more practical, blacklist
> the FC driver kernel module during installation. Maybe we can recommend a
> list
> of known kernel modules that should be blacklisted during installation?
> 
> This will not help if the host is booting from SAN, in this case the only
> way is tell the installer which disk should be used.

Steps to work around this issue:
1. Find the FC kernel module by running the following command:
  [root~]# lspci -k | grep -A 3 "Fibre Channel"
  04:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
	Subsystem: QLogic Corp. QLE2562 PCI Express to 8Gb FC Dual Channel
	Kernel driver in use: qla2xxx
	Kernel modules: qla2xxx

2. Add "module_blacklist=qla2xxx" to the kernel parameters.
3. Make sure the host is in maintenance and the FC SD is active in RHVM.
4. Start provisioning the same host with the FC disks attached.

Result:
 - This does ignore the FC disks.
 - As a result, it prevents the wipe of my FC disks.
 - The swap is not placed on FC.

Thanks to Nir S. and Yuval T.
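
Steps 1 and 2 above can be sketched as a small shell filter. This is only an illustration and was not part of the tested workaround: the function names are hypothetical, and it assumes `lspci -k` output in the layout shown in step 1 (an unindented device line followed by indented detail lines). As noted in comment 43, this approach does not help when the host boots from SAN.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` output on stdin and print the
# "Kernel driver in use" value of every Fibre Channel device, one per line.
fc_drivers_from_lspci() {
    awk '
        # An unindented line starts a new device; remember whether it is FC.
        /^[^[:space:]]/ { in_fc = ($0 ~ /Fibre Channel/) }
        in_fc && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[[:space:]]*/, "")
            print
        }
    ' | sort -u
}

# Hypothetical helper: turn the driver list into one kernel parameter,
# or print nothing when no Fibre Channel driver was found.
module_blacklist_param() {
    drivers=$(fc_drivers_from_lspci | paste -sd, -)
    if [ -n "$drivers" ]; then
        echo "module_blacklist=$drivers"
    fi
}

# Possible use on the host before reprovisioning (not executed here):
#   lspci -k | module_blacklist_param
```

Duplicate drivers (for example, both ports of a dual-port HBA) are collapsed by `sort -u`, and several distinct drivers are joined into a single comma-separated `module_blacklist=` value.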

Comment 49 Steve Goodman 2020-08-03 13:36:54 UTC
I merged https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1726

So now the Warning will appear in the 4.4 Installation Guide. I still need to merge the Upgrade Guide changes.

Comment 50 Steve Goodman 2020-08-03 15:00:15 UTC
Does this bug need to remain open?