Bug 1573776 - [blocked] Enter dracut mode after reboot RHVH with FCoE storage
Summary: [blocked] Enter dracut mode after reboot RHVH with FCoE storage
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: redhat-virtualization-host
Version: 4.2.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.5.0
Target Release: ---
Assignee: Nir Levy
QA Contact: cshao
URL:
Whiteboard:
Depends On: 1575930
Blocks:
Reported: 2018-05-02 09:01 UTC by cshao
Modified: 2020-07-09 23:43 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Node
Target Upstream Version:


Attachments
rdsosreport.png & journalctl.png (3.21 MB, application/x-gzip)
2018-05-02 09:01 UTC, cshao
/tmp/* grub.cfg (292.62 KB, application/x-gzip)
2018-05-07 11:11 UTC, cshao

Description cshao 2018-05-02 09:01:06 UTC
Created attachment 1429801 [details]
rdsosreport.png & journalctl.png

Description of problem:
After rebooting, RHVH installed on FCoE storage drops into the dracut emergency shell.

Version-Release number of selected component (if applicable):
redhat-virtualization-host-4.2-20180430.0 
fcoe-utils-1.0.32-1.el7.x86_64
imgbased-1.0.14-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Install RHVH on FCoE storage.
2. Specialized & Network disks 
  -> Add a disk 
     -> Add FCoE SAN 
        -> NIC(p5p1/p5p2) 
           -> choose "use auto Vlan"
3. Choose the FCoE boot lun to install RHVH.
4. Choose automatic partitioning in the Anaconda GUI and finish the other mandatory steps.
5. Reboot RHVH.

Actual results:
RHVH enters emergency mode.

Expected results:
RHVH boots and login succeeds without error.

Additional info:

Comment 1 Ryan Barry 2018-05-02 10:23:35 UTC
Chen, does this happen if it's not a vlan?

This looks like it may be another 7.5 vlan bug

Comment 2 cshao 2018-05-02 10:30:09 UTC
(In reply to Ryan Barry from comment #1)
> Chen, does this happen if it's not a vlan?
> 
> This looks like it may be another 7.5 vlan bug

Ryan,

We have two FCoE machines, and both use a VLAN. Is there another way to debug this? If not, I will file a ticket asking the admin to remove the VLAN connection.

Thanks.

Comment 3 Ryan Barry 2018-05-03 02:34:49 UTC
Samantha - any known problems here?

Comment 4 Jiri Konecny 2018-05-07 10:30:59 UTC
Hello,

We have a few FCoE bugs reported so it could be related. However, I'm not able to tell you much from those pictures.

cshao, could you please provide the installation logs as plain text files from /tmp/*.log at the end of the installation? It would also be great if you could provide the grub command line used for the failed system boot.

Thank you.

Comment 5 cshao 2018-05-07 11:11:19 UTC
Created attachment 1432575 [details]
/tmp/*   grub.cfg

Comment 6 Yuval Turgeman 2018-05-07 20:51:57 UTC
I don't really have much experience with FCoE, so I might be way off here, but I noticed two things on that machine that might be useful:

1. AUTO_VLAN in the initrd (/etc/fcoe/cfg-p5p1) is set to "no" and `fcoeadm -i` shows p5p1 as Offline.

2. Running `fipvlan -dcs p5p1` in the initrd starts FCoE; fcoeadm then shows p5p1 as Online and lists a number of LUNs as well. After this, running lvm_scan shows all the missing LVs under /dev/rhvh_dell-per730-35/
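
The check in item 1 could be scripted roughly as follows. This is only a sketch: it assumes the interface config uses shell-style KEY="value" lines, and that grep, cut and tr are available (they may not all be present in a minimal initrd). The helper name auto_vlan_setting is hypothetical and not part of fcoe-utils.

```shell
# Hypothetical helper: print the AUTO_VLAN value from an FCoE interface
# config file such as /etc/fcoe/cfg-p5p1.
# Assumes shell-style KEY="value" lines; commented-out lines are ignored.
auto_vlan_setting() {
    # Keep the last AUTO_VLAN= assignment and strip surrounding quotes.
    grep '^AUTO_VLAN=' "$1" | tail -n 1 | cut -d= -f2- | tr -d "\"'"
}
```

For example, `auto_vlan_setting /etc/fcoe/cfg-p5p1` printing "no" inside the initrd would match the observation in item 1, while the installer was told to use auto VLAN.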

Comment 7 Ryan Barry 2018-05-07 21:40:59 UTC
I think this is very likely.

Or, rather, it's likely that one or both of the following are true:

* fcoe= is not present on the cmdline
* the dracut fcoe hook is missing
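
The first point can be checked from the emergency shell with something like the sketch below. It only tests whether an fcoe= argument is present at all, not whether its value is valid, and it does not cover the hook. The function name has_fcoe_arg is hypothetical.

```shell
# Hypothetical helper: succeed if the kernel command line contains an
# fcoe= argument. With no argument it reads /proc/cmdline of the
# running kernel; a string can be passed instead for offline checks.
has_fcoe_arg() {
    cmdline=${1-$(cat /proc/cmdline)}
    # Pad with spaces so fcoe= is matched only at the start of a token.
    case " $cmdline " in
        *" fcoe="*) return 0 ;;
    esac
    return 1
}
```

Usage: `has_fcoe_arg && echo "fcoe= present" || echo "fcoe= missing"`.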

Comment 8 Yuval Turgeman 2018-05-08 03:59:18 UTC
fcoe= looks OK on the cmdline. I don't know about the hook; I'll check today.

Comment 9 Yuval Turgeman 2018-05-08 07:30:23 UTC
The dracut fcoe modules are also there (from dracut-network). I wonder whether this works with regular RHEL?

Comment 10 cshao 2018-05-08 10:10:49 UTC
(In reply to Yuval Turgeman from comment #9)
> dracut fcoe modules are there also (from dracut-network), I wonder does this
> work with a regular RHEL ?

I can reproduce this issue on a RHEL 7.5 host.
I have already sent the test environment to you by mail.

Comment 11 Ryan Barry 2018-05-08 10:26:34 UTC
Is this actually a regression in RHV-H? Has this ever worked?

If this is reproducible on RHEL, I'll open a platform bug and we'll block on it

Comment 12 cshao 2018-05-09 02:28:15 UTC
(In reply to Ryan Barry from comment #11)
> Is this actually a regression in RHV-H? 
Not a regression issue.

> Has this ever worked?
No. Installing RHVH on FCoE storage succeeds, but the system fails to boot (it enters dracut).
 

> If this is reproducible on RHEL, I'll open a platform bug and we'll block on
> it
Agree with you.

Comment 13 cshao 2018-05-29 09:56:55 UTC
Removing the regression keyword per comment #12.

Comment 14 Ryan Barry 2018-06-05 09:10:32 UTC
Moving out while we wait for Anaconda

Comment 16 Sandro Bonazzola 2020-03-11 08:27:05 UTC
The dependent bug has been dropped from 8.2, so this is not going to be fixed for RHV 4.4.

Comment 17 cshao 2020-03-19 13:31:55 UTC
Test version:
redhat-virtualization-host-4.4.0-20200318.0.el8_2
fcoe-utils-1.0.32-7.el8.x86_64
imgbased-1.2.8-1.el8ev.noarch

RHVH can't detect the FCoE storage at all.
1. Install RHVH-UNSIGNED-ISO-4.4-RHEL-8-20200318.0-RHVH-x86_64-dvd1.iso via anaconda GUI on FCoE storage machine.
2. Specialized & Network disks 
  -> Add a disk 
     -> Add FCoE SAN 
        -> NIC(p5p1/p5p2) 
           -> choose "use auto Vlan"

Test result:
RHVH can't detect the FCoE storage at all.

