Bug 1120650

Summary: RHEV-Hypervisor 7.0 auto install failed
Product: Red Hat Enterprise Virtualization Manager
Reporter: cshao <cshao>
Component: ovirt-node
Assignee: Ryan Barry <rbarry>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.5.0
CC: cshao, dfediuck, ecohen, fdeutsch, gklein, gouyang, hadong, huiwa, iheim, juwu, leiwang, rbalakri, rbarry, yaniwang, ycui
Target Milestone: ---
Keywords: TestBlocker
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: node
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1125452
Environment:
Last Closed: 2015-02-11 21:00:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1097295
Bug Blocks: 1094719, 1125452, 1147536, 1164308, 1164311
Attachments:
Description            Flags
auto-failed1.png       none
auto-failed2.png       none
fail                   none
auto-failed-new        none
ovirt.log              none
ovirt-node.log         none
ovirt.log-0918         none
ovirt-node.log-0918    none

Description cshao 2014-07-17 10:50:33 UTC
Created attachment 918673 [details]
auto-failed1.png

Description of problem:
Hypervisor auto install failed

Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report.

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.0-20140714.0
ovirt-node-3.1.0-0.5.20140711git7197118.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Auto install RHEV-H with the parameters below:
BOOTIF=etho storage_init=/dev/sda adminpw=XXXXXX firstboot

Actual results:
Hypervisor auto install failed

Expected results:
Hypervisor auto install succeeds.

Additional info:

Comment 1 cshao 2014-07-17 10:51:11 UTC
Created attachment 918674 [details]
auto-failed2.png

Comment 2 Ying Cui 2014-07-17 11:22:49 UTC
This bug blocks the auto-install parameter test cases, so I am adding the TestBlocker keyword.

Also, the parameter in the description should be BOOTIF=eth0, not BOOTIF=etho.

Comment 4 Ryan Barry 2014-07-17 19:25:10 UTC
I'm working on this, but I'd appreciate an rdsosreport from a system with rd.debug on the kernel cmdline. This appears to be reproducible with BOOTIF only.

Comment 6 Fabian Deutsch 2014-07-30 13:19:03 UTC
RHEL 7 uses predictable network interface names. Could you please try ens3 or p*p*-based names?

The easiest way to determine the name is to boot into the TUI installer, drop to a shell, and run ip l.
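
As a rough illustration of the suggestion above, the interface names the kernel exposes can also be read from sysfs rather than from `ip l` output. The sketch below is not part of ovirt-node; the function names are made up for this example, and it assumes a Linux system where /sys/class/net holds one entry per interface.

```python
import os


def list_interface_names(sysfs_net="/sys/class/net"):
    """Return sorted NIC names found under sysfs, excluding loopback.

    Assumption: each interface appears as an entry in sysfs_net.
    Returns an empty list if the directory does not exist.
    """
    if not os.path.isdir(sysfs_net):
        return []
    return sorted(name for name in os.listdir(sysfs_net) if name != "lo")


def looks_like_legacy_name(name):
    """True for old kernel-style names such as eth0 or eth1."""
    return name.startswith("eth") and name[3:].isdigit()


if __name__ == "__main__":
    for nic in list_interface_names():
        note = " (legacy-style name)" if looks_like_legacy_name(nic) else ""
        print(nic + note)
```

Whichever name is printed for the boot NIC (for example ens3) is the value to try for BOOTIF.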

Comment 7 cshao 2014-07-31 06:35:54 UTC
(In reply to Fabian Deutsch from comment #6)
> RHEL 7 uses predictable network interface names. Could you please try ens3 or
> p*p*-based names?
> 
> The easiest way to determine the name is to boot into the TUI installer, drop
> to a shell, and run ip l.

Hi fabiand,

The auto installation still failed with the "BOOTIF=ens3 storage_init=/dev/sda firstboot" parameters, but this time with a different error message.

Test version:
rhev-hypervisor7-7.0-20140714.0

Please see attachment "fail.png" for more details.

Comment 8 cshao 2014-07-31 06:36:38 UTC
Created attachment 922827 [details]
fail

Comment 9 Fabian Deutsch 2014-07-31 13:58:41 UTC
Hey Chen,

thanks.

Julie, do we need to document that the "new" device names need to be used? Or is it assumed that users know that RHEL 7 follows a different NIC naming scheme?

Comment 10 Julie 2014-07-31 22:03:03 UTC
(In reply to Fabian Deutsch from comment #9)
> Hey Chen,
> 
> thanks.
> 
> Julie, do we need to document that the "new" device names need to be used?
> Or is it assumed that users know that RHEL 7 follows a different NIC
> naming scheme?

Thanks for bringing this bug to my attention. Instructions for RHEV-H 7 will need to be updated. Once updated, we will be requesting a tech review from the engineering team. 

Cheers,
Julie

Comment 11 haiyang,dong 2014-08-07 06:33:30 UTC
Test version:
ovirt-node-3.1.0-0.7.20140806gitef5c5cb.el7.noarch
rhev-hypervisor7-7.0-20140806.1.iso 

Auto install still failed with the following error:
[root@dhcp-66-72-90 admin]# python /etc/ovirt-config-boot.d/snmp_autoinstall.py
Traceback (most recent call last):
  File "/etc/ovirt-config-boot.d/snmp_autoinstall.py", line 24, in <module>
    args = system.kernel_cmdline_args()
AttributeError: 'module' object has no attribute 'kernel_cmdline_args'


[root@dhcp-66-72-90 admin]# python /etc/ovirt-config-boot.d/cim_autoinstall.py
Traceback (most recent call last):
  File "/etc/ovirt-config-boot.d/cim_autoinstall.py", line 26, in <module>
    args = system.kernel_cmdline_args()
AttributeError: 'module' object has no attribute 'kernel_cmdline_args'

So the bug needs to be moved back to ASSIGNED.
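
The AttributeError above says only that the `system` module imported by these scripts does not expose `kernel_cmdline_args`. Judging by its name, such a helper would return the boot parameters; that is an assumption, and the sketch below is a hypothetical stand-in, not the ovirt-node implementation. It shows how parameters such as BOOTIF and storage_init could be read from a kernel command line.

```python
import shlex


def parse_kernel_cmdline(cmdline):
    """Split a kernel command line string into a dict.

    key=value tokens map key -> value; bare flags such as
    'firstboot' map to True. Hypothetical helper for illustration.
    """
    args = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        args[key] = value if sep else True
    return args


def read_kernel_cmdline(path="/proc/cmdline"):
    """Parse the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return parse_kernel_cmdline(f.read())


if __name__ == "__main__":
    sample = "BOOTIF=ens3 storage_init=/dev/sda firstboot"
    print(parse_kernel_cmdline(sample))
```

A caller that must run against several library versions could check `hasattr(system, "kernel_cmdline_args")` before calling it, so that a missing helper is reported clearly instead of aborting the plugin.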

Comment 12 cshao 2014-09-05 05:18:07 UTC
Test version:
rhev-hypervisor7-7.0-20140904.0.el7ev
ovirt-node-3.1.0-0.10.20140904gitb828c37.el7.noarch


Test steps:
A clean auto install with "BOOTIF=xxx storage_init=/dev/sda adminpw=xxx enforcing=0 firstboot" still failed; please see the attachment for more details.

So the bug needs to be moved back to ASSIGNED again.

Comment 13 cshao 2014-09-05 05:19:03 UTC
Created attachment 934683 [details]
auto-failed-new

Comment 14 Fabian Deutsch 2014-09-05 09:19:28 UTC
Chen, could you please attach /var/log/ovirt.log and /var/log/ovirt-node.log?

Comment 15 cshao 2014-09-05 09:25:05 UTC
Created attachment 934720 [details]
ovirt.log

Comment 16 cshao 2014-09-05 09:30:18 UTC
Created attachment 934722 [details]
ovirt-node.log

Comment 17 cshao 2014-09-19 06:52:51 UTC
I have to move this bug back to ASSIGNED because auto install still fails; the error is the same as in #c11.

Running snmp*.py manually works without issue, but I am not sure why it fails in the TUI.
#python /etc/ovirt-config-boot.d/snmp_autoinstall.pyc
#echo $?
0

Test version:
rhev-hypervisor7-7.0-20140918.0.iso
ovirt-node-3.1.0-0.13.20140918gitdda78cb.el7.noarch
vdsm-4.14.13-2.el7ev.x86_64
vdsm-reg-4.14.13-2.el7ev.noarch
ovirt-node-plugin-vdsm-0.1.2-3.el7ev.noarch
libvirt-1.1.1-29.el7_0.1.x86_64

Comment 18 cshao 2014-09-19 06:53:44 UTC
Created attachment 939104 [details]
ovirt.log-0918

Comment 19 cshao 2014-09-19 06:54:50 UTC
Created attachment 939105 [details]
ovirt-node.log-0918

Comment 20 Fabian Deutsch 2014-09-19 14:33:18 UTC
(In reply to shaochen from comment #17)
> I have to move this bug back to ASSIGNED because auto install still fails;
> the error is the same as in #c11.
> 
> Running snmp*.py manually works without issue, but I am not sure why it fails in the TUI.
> #python /etc/ovirt-config-boot.d/snmp_autoinstall.pyc
> #echo $?
> 0


Right.
I could reproduce a failed auto-installation, but with a different symptom than before.

My findings: The new persistence code returns different values for different errors, which made the installer think that a persistence operation had led to a severe failure. This has been fixed.
In addition, some SELinux denials were preventing normal operation; this has also been addressed.
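
The failure mode described above, where any non-zero return value is treated as fatal, can be sketched as follows. The result codes and names are invented for illustration and are not taken from the ovirt-node persistence code; the example assumes Python 3.4 or later for the enum module.

```python
from enum import IntEnum


class PersistResult(IntEnum):
    """Hypothetical result codes for a persistence call."""
    OK = 0
    ALREADY_PERSISTED = 1  # non-zero, but harmless
    FAILED = 2             # a real error


# Codes the installer may safely ignore.
BENIGN = {PersistResult.OK, PersistResult.ALREADY_PERSISTED}


def naive_is_fatal(code):
    """Treats every non-zero code as a severe failure."""
    return code != 0


def is_fatal(code):
    """Treats only non-benign or unknown codes as severe failures."""
    try:
        result = PersistResult(code)
    except ValueError:
        return True  # unknown code: stay on the safe side
    return result not in BENIGN


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, naive_is_fatal(code), is_fatal(code))
```

With the naive check, a harmless code such as ALREADY_PERSISTED would abort the installation, while the second check lets it continue.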

Tested as follows:

1. Add BOOTIF=ens3 storage_init=/dev/sda to the kernel command line
2. Watch the auto-installation take place, ending with a reboot

Comment 22 cshao 2014-09-29 06:18:11 UTC
Test version:
rhev-hypervisor7-7.0-20140926.0.iso
ovirt-node-3.1.0-0.17.20140925git29c3403.el7.noarch

Test result:
Auto install of RHEV-H with the parameters below succeeds.
BOOTIF=xxx storage_init=/dev/sda adminpw=xxx firstboot

So the bug is fixed; changing the bug status to VERIFIED.

NOTE:
Although auto install succeeds, there is still another issue with the login password.
The user cannot log in to the hypervisor with the newly set password.
Pop-up: Authentication token manipulation error.
I am debugging it and will file a new bug to track this issue.

Thanks!

Comment 26 errata-xmlrpc 2015-02-11 21:00:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0160.html