Bug 1324435

Summary: Anaconda exception occurs on iscsi machine
Product: [oVirt] ovirt-node Reporter: cshao <cshao>
Component: Installation & Update Assignee: Ryan Barry <rbarry>
Status: CLOSED CURRENTRELEASE QA Contact: cshao <cshao>
Severity: urgent Docs Contact:
Priority: medium    
Version: 4.0 CC: anaconda-maint-list, bmcclain, bugs, cshao, fdeutsch, huzhao, leiwang, rvykydal, weiwang, yaniwang, ycui
Target Milestone: ovirt-4.0.1 Keywords: TestBlocker, TestOnly
Target Release: --- Flags: rule-engine: ovirt-4.0.z+
rule-engine: blocker+
rule-engine: planning_ack+
rule-engine: devel_ack+
cshao: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: anaconda-21.48.22.65-1 ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061416.iso Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1325134 (view as bug list) Environment:
Last Closed: 2016-07-19 06:24:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Node RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1269195, 1325134    
Bug Blocks:    
Attachments:
Description Flags
Anaconda exception none
all log info none
complete-iscis.png none
iscsi1 none
local-pass none

Description cshao 2016-04-06 10:40:55 UTC
Created attachment 1144157 [details]
Anaconda exception

Description of problem:
Anaconda exception occurs on iscsi machine

Version-Release number of selected component (if applicable):
ovirt-node-ng-installer-master-20160405.iso
squashfs.20160405
ovirt-node-ng-image-update-placeholder-4.0.0-0.2.alpha1.20160405123556.gitbd184ec.el7.noarch
imgbased-0.5-0.201604040928gitd6a85f8.el7.centos.noarch
ovirt-release-host-node-4.0.0-0.2.alpha1.20160405123556.gitbd184ec.el7.noarch


How reproducible:
100%

Steps to Reproduce:
1. Boot from PXE and manually install NGN 4.0.
2. Try to finish the installation with correct steps.
3. Focus on the installation process.
4. Please see the attachment for more detail.

Actual results:
Anaconda exception occurs on iscsi machine

Expected results:


Additional info:
No such issue on FC/local machine.

Comment 1 cshao 2016-04-06 10:42:55 UTC
Created attachment 1144171 [details]
all log info

Comment 2 Fabian Deutsch 2016-04-06 20:52:03 UTC
From comment 1's anaconda.log:

14:13:00,548 INFO anaconda: Installing boot loader
14:13:00,549 DEBUG anaconda: running handleException
14:13:00,550 CRIT anaconda: Traceback (most recent call last):

  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 227, in run
    threading.Thread.run(self, *args, **kwargs)

  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)

  File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 254, in doInstall
    writeBootLoader(storage, payload, instClass, ksdata)

  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 2383, in writeBootLoader
    log.info("boot loader stage1 target device is %s", stage1_device.name)

AttributeError: 'NoneType' object has no attribute 'name'

14:13:00,551 DEBUG anaconda: Gtk running, queuing exception handler to the main loop

Looks like a bug in anaconda.

Any idea, anaconda team?
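
A minimal sketch of the failure mode in the traceback above, not the actual pyanaconda code: the log call dereferences `stage1_device.name` while `stage1_device` is `None`. The names `BootLoaderError`, `Device`, and `describe_stage1_target` are hypothetical stand-ins used only for illustration. An explicit check would turn the opaque AttributeError into a clear error:

```python
import logging
from collections import namedtuple

log = logging.getLogger("anaconda")


class BootLoaderError(Exception):
    """Hypothetical stand-in for an installer boot loader error."""


# Hypothetical minimal device type; the real object carries far more state.
Device = namedtuple("Device", ["name"])


def describe_stage1_target(stage1_device):
    """Log and return the stage1 device name, failing clearly if none was found."""
    if stage1_device is None:
        # Without this check, stage1_device.name raises
        # AttributeError: 'NoneType' object has no attribute 'name'
        raise BootLoaderError("failed to find a suitable stage1 device")
    log.info("boot loader stage1 target device is %s", stage1_device.name)
    return stage1_device.name
```

Whether the installer should raise here or stop earlier, when stage1 selection fails, is a design question for the anaconda maintainers; the sketch only shows why the crash is an AttributeError rather than a readable message.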

Comment 3 Red Hat Bugzilla Rules Engine 2016-04-06 20:52:31 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Radek Vykydal 2016-04-07 10:01:48 UTC
14:09:38,712 DEBUG anaconda: stage1 device cannot be on an iSCSI disk
14:09:40,759 DEBUG anaconda: new disk order: []
14:09:40,809 DEBUG anaconda: stage1 device cannot be of type lvmvg
14:09:40,810 DEBUG anaconda: stage1 device cannot be of type lvmthinlv
14:09:40,810 DEBUG anaconda: stage1 device cannot be of type lvmthinpool
14:09:40,811 DEBUG anaconda: stage1 device cannot be of type lvmthinlv
14:09:40,812 DEBUG anaconda: stage1 device cannot be of type lvmlv
14:09:40,813 DEBUG anaconda: stage1 device cannot be on an iSCSI disk
14:09:40,814 DEBUG anaconda: stage1 device cannot be of type partition
14:09:40,814 DEBUG anaconda: stage1 device cannot be of type partition
14:09:40,815 ERR anaconda: BootLoader setup failed: failed to find a suitable stage1 device

Seems like a duplicate of bug 1269195; see especially
https://bugzilla.redhat.com/show_bug.cgi?id=1269195#c23

Comment 5 Radek Vykydal 2016-04-07 10:22:12 UTC
Perhaps we are incorrectly disallowing installing bootloader on offload iSCSI disks.
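
To illustrate the hypothesis, here is a rough sketch of how a stage1 candidate filter could produce the rejection messages seen in comment 4, and how permitting offload iSCSI disks would change the outcome. This is not the anaconda implementation: the `Candidate` class, its fields, and the `allow_offload_iscsi` parameter are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Device types rejected in the log from comment 4.
REJECTED_TYPES = {"lvmvg", "lvmlv", "lvmthinlv", "lvmthinpool", "partition"}


@dataclass
class Candidate:
    """Hypothetical candidate device; field names are illustrative only."""
    name: str
    type: str
    on_iscsi: bool = False
    offload: bool = False  # True for an offload iSCSI disk


def find_stage1(candidates, allow_offload_iscsi):
    """Return (device, reasons): first acceptable device or None, plus rejection messages."""
    reasons = []
    for dev in candidates:
        if dev.on_iscsi and not (allow_offload_iscsi and dev.offload):
            reasons.append("stage1 device cannot be on an iSCSI disk")
        elif dev.type in REJECTED_TYPES:
            reasons.append("stage1 device cannot be of type %s" % dev.type)
        else:
            return dev, reasons
    return None, reasons


disks = [
    Candidate("vg0", "lvmvg"),
    Candidate("sda", "disk", on_iscsi=True, offload=True),
]
# Rejecting every iSCSI disk leaves no candidate, as in the failing install.
strict, strict_reasons = find_stage1(disks, allow_offload_iscsi=False)
# Permitting offload iSCSI disks lets the same disk qualify.
relaxed, _ = find_stage1(disks, allow_offload_iscsi=True)
```

With the strict policy every candidate is rejected and the caller is left with no stage1 device, which matches the "failed to find a suitable stage1 device" error in the log.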

Comment 6 Fabian Deutsch 2016-04-07 11:02:16 UTC
Thanks Radek.

Let's keep this bug for tracking purposes on the RHEV-H side.

Comment 7 Radek Vykydal 2016-04-07 12:23:43 UTC
Would you be able to test this updates image (for RHEL 7.2) with the fix?

It can be applied either by boot option:

updates https://rvykydal.fedorapeople.org/updates.iscsioffload.img

or by kickstart command

updates=https://rvykydal.fedorapeople.org/updates.iscsioffload.img

I think if the case is (as I assume from the logs) an installation to a single (partial?) offload iSCSI device we can reassign the bug to Anaconda. I'm asking for assistance with checking the fix as it is not easy to get to the hw.

Comment 8 Radek Vykydal 2016-04-07 13:52:48 UTC
(In reply to Radek Vykydal from comment #7)

Oh sorry,

> Would you be able to test this updates image (for RHEL 7.2) with fix?
> 
> It can be applied either by boot option:
> 
> updates https://rvykydal.fedorapeople.org/updates.iscsioffload.img
> 

this is kickstart command

> or by kickstart command
> 
> updates=https://rvykydal.fedorapeople.org/updates.iscsioffload.img
> 

this is boot option
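
For reference, the corrected mapping can be written down as a small helper. This is only an illustrative sketch; `updates_directives` is a hypothetical name, and the two string forms are taken from the correction above (space-separated for kickstart, `=` for the boot option).

```python
def updates_directives(url):
    """Return the two ways of pointing the installer at an updates image."""
    return {
        # Appended to the installer's kernel command line at boot.
        "boot_option": "updates=%s" % url,
        # Written as a line in the kickstart file.
        "kickstart": "updates %s" % url,
    }


directives = updates_directives(
    "https://rvykydal.fedorapeople.org/updates.iscsioffload.img"
)
```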

Comment 9 cshao 2016-04-08 06:37:44 UTC
(In reply to Radek Vykydal from comment #8)
> (In reply to Radek Vykydal from comment #7)
> 
> Oh sorry,
> 
> > Would you be able to test this updates image (for RHEL 7.2) with fix?
> > 
> > It can be applied either by boot option:
> > 
> > updates https://rvykydal.fedorapeople.org/updates.iscsioffload.img
> > 
> 
> this is kickstart command
> 
> > or by kickstart command
> > 
> > updates=https://rvykydal.fedorapeople.org/updates.iscsioffload.img
> > 
> 
> this is boot option

The updates image works well on a single-path iSCSI machine; the installation completes successfully. Please see the attachment "complete-iscis.png" for more detail.

updates=https://rvykydal.fedorapeople.org/updates.iscsioffload.img

Comment 10 cshao 2016-04-08 06:38:32 UTC
Created attachment 1145011 [details]
complete-iscis.png

Comment 11 Radek Vykydal 2016-04-08 07:24:54 UTC
Should I create a clone for anaconda or are you OK with reassigning the BZ?

Comment 12 Radek Vykydal 2016-04-08 07:31:26 UTC
Also reproducible on storageqe-81.lab.eng.brq.redhat.com.

Comment 13 Fabian Deutsch 2016-04-08 11:27:47 UTC
I'll clone it Radek, because we need to track it from the RHEV side.

Comment 14 Fabian Deutsch 2016-04-08 20:53:30 UTC
Lowering priority, because we are only tracking it here.

Comment 15 Sandro Bonazzola 2016-05-02 09:53:53 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 16 Douglas Schilling Landgraf 2016-05-04 15:37:25 UTC
Moving to ON_QA as the tracker bz#1325134 already moved.

Comment 17 Ying Cui 2016-05-06 08:48:18 UTC
(In reply to Douglas Schilling Landgraf from comment #16)
> Moving to ON_QA as the tracker bz#1325134 already moved.

anaconda bz#1325134 is targeted to RHEL 7.3, and we have to request its fix in RHEL 7.2.z. So for our NGN this bug is not completely done: we still cannot officially verify it on the u/s Jenkins build or the d/s build.

Comment 18 Fabian Deutsch 2016-05-10 09:29:15 UTC
The fix for this bug has not yet landed in the installation ISO, thus moving this bug back to MODIFIED.

Comment 19 Fabian Deutsch 2016-06-14 19:02:50 UTC
Some patches have been merged into the node anaconda branch, so I expect that some iSCSI flows now work.

Please re-test this bug with any build equal to or newer than the build listed in Fixed In Version.

Comment 20 cshao 2016-06-15 10:31:45 UTC
Test version:
ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061416.iso
imgbased-0.7.0-0.201606141357gitbd2220e.el7.centos.noarch


Test steps:
1. Boot from the ISO and run an Anaconda interactive install of RHVH 4.0 on an iSCSI machine.
2. Install RHVH on an iSCSI LUN (automatic or manual).
3. Please see the attachments for more details.

Test result:
Failed to install on the iSCSI LUN; it reports "Failed to find a suitable stage1 device".

NOTE:
If RHVH is installed on a local disk of the same machine, the install can continue.

The machine is located in our lab. Pressing Ctrl+Alt+F2~F5 does not switch to a shell, so I can't obtain log info, but I will leave the test environment as it is in case it is needed for debugging.

So I have to move this bug back to ASSIGNED.

Comment 21 Red Hat Bugzilla Rules Engine 2016-06-15 10:31:52 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 22 cshao 2016-06-15 10:32:23 UTC
Created attachment 1168307 [details]
iscsi1

Comment 23 cshao 2016-06-15 10:32:53 UTC
Created attachment 1168308 [details]
local-pass

Comment 24 cshao 2016-06-15 10:37:29 UTC
(In reply to shaochen from comment #20)
> Test version:
> ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061416.iso
> imgbased-0.7.0-0.201606141357gitbd2220e.el7.centos.noarch
> 
> 
> Test steps:
> 1. Boot from iso and anaconda interactive install rhvh 4.0 on iscsi machine.
> 2. Install rhvh on iscsi lun(automatic or manual).
> 3. Please see attachment for more details
> 
> Test result:
> Failed to install on iscsi lun, it report "Failed to find a suitable stage1
> device".
> 
> NOTE:
> If install rhvh on local disk with the same machine, then install can
> continue.
> 
> The machine located on our lab, I press ctrl+ alt + F2~5 can't enter shell
> mode, so I can't obtain log info. but I will leave test env to here if need
> to debug.
> 
> So I have to assigned this bug.

The same issue occurs with ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061504.iso.

Comment 25 Fabian Deutsch 2016-06-15 10:38:38 UTC
Radek, can you tell if the testing was done correctly?

According to the screenshot no valid stage1 device could be found.

Comment 26 Fabian Deutsch 2016-06-15 10:44:55 UTC
I actually found something:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/appe-iscsi-disks.html

"""
The /boot partition cannot be placed on iSCSI targets which have been added manually using this method - an iSCSI target containing a /boot partition must be configured for use with iBFT. 
"""

This means that when iSCSI is used, /boot either needs to be on a local disk or the host needs to support iBFT.

Considering the screenshot from comment 22, it looks like you only selected an iSCSI LUN, which is not sufficient.

Conclusion: You either need to have a separate disk to hold /boot, or the host needs to support iBFT.

Please check:
- Does the host support iBFT?
- Retry the testing and place /boot on a local disk
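
The rule quoted above and the two checks can be sketched roughly as follows. This is an illustration, not installer code: the function names are hypothetical, and the sysfs path assumes the kernel exposes the firmware table under `/sys/firmware/ibft` (which typically requires the `iscsi_ibft` module to be loaded), so an absent directory is only a hint, not proof that the hardware lacks iBFT.

```python
import os


def host_exposes_ibft(sysfs_path="/sys/firmware/ibft"):
    """Best-effort check; assumes the kernel exposes the iBFT under this directory."""
    return os.path.isdir(sysfs_path)


def boot_placement_ok(boot_on_iscsi, ibft_available):
    """Apply the documented rule: /boot on iSCSI requires an iBFT-configured target."""
    return (not boot_on_iscsi) or ibft_available
```

For example, `boot_placement_ok(True, host_exposes_ibft())` describes the case from comment 22, where only an iSCSI LUN was selected, and `boot_placement_ok(False, False)` describes placing /boot on a local disk on a host without iBFT.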

Comment 27 cshao 2016-06-16 03:12:34 UTC
(In reply to Fabian Deutsch from comment #26)
> I actually found something:
> 
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/
> html/Installation_Guide/appe-iscsi-disks.html
> 
> """
> The /boot partition cannot be placed on iSCSI targets which have been added
> manually using this method - an iSCSI target containing a /boot partition
> must be configured for use with iBFT. 
> """
> 
> This means, that in case iSCSI is used, /boot either needs to be on a local
> disk or the host needs to support iBFT.
> 
> Considering the screenshot from comment 22, it looks like you only selected
> an iSCSI LUN, which is not sufficient.
> 
> Conclusion: You either need to have a separate disk to hold /boot, or the
> host needs to support iBFT.
> 
> Please check:
> - Does the host support iBFT?
> - Retry the testing and place /boot on a local disk

Thanks Fabian, the above doc is really helpful to us.
Now the installation succeeds on the iSCSI machine, but we found a new bug.


Test machines (support iBFT):
Server: dell-per-515-01 
Workstation: HP-Z800-02

Test result:
With only an iSCSI LUN selected, the Anaconda interactive installation succeeds on the above iSCSI machines.




Test machine (does not support iBFT):
Tower: dell-pet-105-01

Test steps:
1. Place the /boot partition on the local disk.
2. Place the other partitions (/, /home, /var, swap) on the iSCSI disk and the local disk.


Test result:
The install failed and hit Bug 1347088 - An error occurs during anaconda interactive installation on multipath disk machine

But the original bug has been fixed, so I'd like to change the bug status to VERIFIED.

Comment 28 Sandro Bonazzola 2016-07-19 06:24:01 UTC
Since the problem described in this bug report should be
resolved in oVirt 4.0.1 released on July 19th 2016, it has been closed with a
resolution of CURRENT RELEASE.

For information on the release, and how to update to this release, follow the link below.

If the solution does not work for you, open a new bug report.

http://www.ovirt.org/release/4.0.1/