Created attachment 1148969 [details]
uefi1-error

Description of problem:
NGN cannot boot on a UEFI machine because the installer failed to write the boot loader configuration.

Version-Release number of selected component (if applicable):
ovirt-node-ng-installer-ovirt-3.6-2016041820.iso
ovirt-node-ng-image-update-placeholder-3.6.5-0.0.master.20160414085328.gitcedfbf3.el7.noarch
imgbased-0.6-0.201604150305git1e3b28f.el7.centos.noarch
ovirt-release-host-node-3.6.5-0.0.master.20160414085328.gitcedfbf3.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Install NGN 4.0 with Anaconda on a UEFI machine.
2. Watch the installation process; an error occurs.
3. Ignore the error and continue with the installation.
4. Reboot NGN.

Actual results:
1. Step 2: An error occurred while installing the boot loader.
2. Step 3: Installation complete, NGN is now successfully installed and ready for use.
3. Step 4: NGN cannot boot on the UEFI machine because writing the boot loader configuration failed.

Expected results:
NGN boots on a UEFI machine.

Additional info:
I can't obtain the logs; if needed, I will provide the test environment for debugging.
Created attachment 1148970 [details] uefi-2-reboot
Created attachment 1148972 [details] uefi-3-can't boot
Can you please also provide the log files from the installation?
(In reply to Fabian Deutsch from comment #3)
> Can you please also provide the log files from the installation?

I can't obtain the logs, but I have sent the test environment to you by mail. Is there a shortcut key to enter a shell so that I can collect the logs?
There are some hints here on how to debug anaconda problems:
https://fedoraproject.org/wiki/How_to_debug_installation_problems
https://fedoraproject.org/wiki/Anaconda/Logging
In general you can press ctrl-alt-f2 to drop to a shell and retrieve log files.
(In reply to Fabian Deutsch from comment #5)
> There are some hints here of how to debug anaconda problems
>
> https://fedoraproject.org/wiki/How_to_debug_installation_problems
> https://fedoraproject.org/wiki/Anaconda/Logging
>
> In general you can press ctrl-alt-f2 to drop to a shell and retrieve log
> files.

Pressing ctrl-alt-f2 does not seem to drop to a shell: a blurred screen appears after pressing those keys, and I can't type anything on it. Please see "blurred_screen.png". I will try on another machine and attach the logs if I can retrieve them.
Created attachment 1149070 [details] blurred screen
Please retry with ctrl-alt-f3 or another F key.
There is no such issue on a local machine; pressing ctrl-alt-f2 drops to a shell directly. I will send a ticket to the lab machine admin to ask him to press those keys directly on the machine.
Created attachment 1149353 [details]
traceback-log

Captured the traceback log; I hope it is useful. I am still working on obtaining the full logs.
Created attachment 1149417 [details] all-log
From anaconda-tb:

anaconda 21.48.22.56-1 exception report
Traceback (most recent call first):
  File "/tmp/updates/pyanaconda/bootloader.py", line 1573, in write_config
    raise BootLoaderError("failed to write boot loader configuration")
  File "/tmp/updates/pyanaconda/bootloader.py", line 1774, in write
    self.write_config()
  File "/tmp/updates/pyanaconda/bootloader.py", line 2369, in writeBootLoaderFinal
    storage.bootloader.write()
  File "/tmp/updates/pyanaconda/bootloader.py", line 2441, in writeBootLoader
    writeBootLoaderFinal(storage, payload, instClass, ksdata)
  File "/tmp/updates/pyanaconda/install.py", line 254, in doInstall
    writeBootLoader(storage, payload, instClass, ksdata)
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/tmp/updates/pyanaconda/threads.py", line 227, in run
    threading.Thread.run(self, *args, **kwargs)
BootLoaderError: failed to write boot loader configuration

And further down:

18:05:44,759 INFO program: Running... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
18:05:47,029 INFO program: /sbin/grub2-mkconfig: line 241: /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory
18:05:47,030 DEBUG program: Return code: 1

Vratislav, have you seen this kind of error?
Actually, I see more:

18:05:42,316 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi"","security.selinux") failed: Operation not supported (95)
18:05:42,318 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI"","security.selinux") failed: Operation not supported (95)
18:05:42,318 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos"","security.selinux") failed: Operation not supported (95)
18:05:42,318 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/.gcdx64.efi.YLn5wU"","security.selinux") failed: Operation not supported (95)
18:05:42,319 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/.grubenv.x5L8y9"","security.selinux") failed: Operation not supported (95)
18:05:42,320 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/.grubx64.efi.KZ5dBo"","security.selinux") failed: Operation not supported (95)
18:05:42,321 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/fonts"","security.selinux") failed: Operation not supported (95)
18:05:42,321 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/fonts/.unicode.pf2.HlymoF"","security.selinux") failed: Operation not supported (95)
18:05:42,321 INFO program: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]
18:05:42,321 DEBUG program: Return code: 23
18:05:43,951 INFO program: Running... new-kernel-pkg --rpmposttrans 3.10.0-327.13.1.el7.x86_64
18:05:44,066 DEBUG program: Return code: 0
18:05:44,601 INFO program: Running... efibootmgr
18:05:44,621 ERR program: Error running efibootmgr: No such file or directory
18:05:44,624 INFO program: Running... grub2-set-default 0
18:05:44,758 DEBUG program: Return code: 0
18:05:44,759 INFO program: Running... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
18:05:47,029 INFO program: /sbin/grub2-mkconfig: line 241: /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory
18:05:47,030 DEBUG program: Return code: 1

Initially
/mnt/sysimage/boot/efi/EFI/centos
is used
then
/boot/efi/EFI/fedora

I'm not concerned about the different prefix (/mnt/sysimage), but about the different OS part (centos vs fedora).

Chen, are you sure that the vmlinux, initrd and stage2 are from CentOS?
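Side note for anyone triaging similar logs: the "Running..." and "Return code:" pairs quoted above can be matched up mechanically. The following is only a minimal sketch, assuming the line layout seen in the excerpts in this bug; the function name failed_commands and the SAMPLE data are illustrative and not part of anaconda. A command that never logs a return code (such as the efibootmgr call above, which only logs an ERR line) is not reported by this sketch.

```python
import re

# Matches lines such as "18:05:44,624 INFO program: Running... grub2-set-default 0"
RUNNING = re.compile(r"program: Running\.\.\. (?P<cmd>.+)$")
# Matches lines such as "18:05:47,030 DEBUG program: Return code: 1"
RETCODE = re.compile(r"program: Return code: (?P<rc>-?\d+)$")


def failed_commands(lines):
    """Return (command, return_code) pairs for commands that exited non-zero."""
    failures = []
    current = None  # most recent command that has not yet reported a return code
    for line in lines:
        line = line.rstrip()
        match = RUNNING.search(line)
        if match:
            current = match.group("cmd")
            continue
        match = RETCODE.search(line)
        if match and current is not None:
            rc = int(match.group("rc"))
            if rc != 0:
                failures.append((current, rc))
            current = None
    return failures


# Sample lines copied from the log excerpt in this bug.
SAMPLE = """\
18:05:44,624 INFO program: Running... grub2-set-default 0
18:05:44,758 DEBUG program: Return code: 0
18:05:44,759 INFO program: Running... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
18:05:47,029 INFO program: /sbin/grub2-mkconfig: line 241: /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory
18:05:47,030 DEBUG program: Return code: 1
""".splitlines()

if __name__ == "__main__":
    for cmd, rc in failed_commands(SAMPLE):
        print(f"{cmd} -> exit {rc}")
```

On the sample lines this reports only the grub2-mkconfig call, which is the failure behind the traceback.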
(In reply to Fabian Deutsch from comment #13)
> Actually, I see more:
>
> 18:05:42,316 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi"","security.selinux") failed: Operation
> not supported (95)
> 18:05:42,318 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI"","security.selinux") failed:
> Operation not supported (95)
> 18:05:42,318 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos"","security.selinux") failed:
> Operation not supported (95)
> 18:05:42,318 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/.gcdx64.efi.YLn5wU"","security.
> selinux") failed: Operation not supported (95)
> 18:05:42,319 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/.grubenv.x5L8y9"","security.
> selinux") failed: Operation not supported (95)
> 18:05:42,320 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/.grubx64.efi.KZ5dBo"",
> "security.selinux") failed: Operation not supported (95)
> 18:05:42,321 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/fonts"","security.selinux")
> failed: Operation not supported (95)
> 18:05:42,321 INFO program: rsync: rsync_xal_set:
> lsetxattr(""/mnt/sysimage/boot/efi/EFI/centos/fonts/.unicode.pf2.HlymoF"",
> "security.selinux") failed: Operation not supported (95)
> 18:05:42,321 INFO program: rsync error: some files/attrs were not
> transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]
> 18:05:42,321 DEBUG program: Return code: 23
> 18:05:43,951 INFO program: Running... new-kernel-pkg --rpmposttrans
> 3.10.0-327.13.1.el7.x86_64
> 18:05:44,066 DEBUG program: Return code: 0
> 18:05:44,601 INFO program: Running... efibootmgr
> 18:05:44,621 ERR program: Error running efibootmgr: No such file or directory
> 18:05:44,624 INFO program: Running... grub2-set-default 0
> 18:05:44,758 DEBUG program: Return code: 0
> 18:05:44,759 INFO program: Running... grub2-mkconfig -o
> /boot/efi/EFI/fedora/grub.cfg
> 18:05:47,029 INFO program: /sbin/grub2-mkconfig: line 241:
> /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory
> 18:05:47,030 DEBUG program: Return code: 1
>
> Initially
> /mnt/sysimage/boot/efi/EFI/centos
> is used
> then
> /boot/efi/EFI/fedora
>
> I'm not concerned about the different prefix (/mnt/sysimage), but about the
> different OS part (centos vs fedora).
>
> Chen, are you sure that the vmlinux, initrd and stage2 are from CentOS?

Yes. I tested this issue on a UEFI machine with virtual media attached; they are from CentOS.
Thanks. Vratislav, do you have any idea on this one?
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.
Please retest this with the upstream build.

It is important to:
1. Use kernel + initrd from centos7 repos
2. Use stage2 squashfs from centos7 repos
this looks like some weird mixture of fedora and centos stuff
(In reply to Fabian Deutsch from comment #18)
> Please retest this with the upstream build.
>
> It is important to:
> 1. Use kernel + initrd from centos7 repos
> 2. Use stage2 squashfs from centos7 repos

We cannot access the UEFI remote console right now; I will reply to the needinfo ASAP.
(In reply to Fabian Deutsch from comment #18)
> Please retest this with the upstream build.
>
> It is important to:
> 1. Use kernel + initrd from centos7 repos
> 2. Use stage2 squashfs from centos7 repos

An Anaconda exception occurs this time.
============================================================================
16:26:14,448 INFO anaconda: Installing boot loader
16:26:14,449 INFO anaconda: boot loader stage1 target device is sda1
16:26:14,449 INFO anaconda: boot loader stage2 target device is sda2
16:26:14,450 DEBUG anaconda: new default image: <pyanaconda.bootloader.LinuxBootLoaderImage object at 0x504bcd0>
16:26:14,628 INFO anaconda: bootloader.py: used boot args: crashkernel=auto rd.lvm.lv=centos00/root00 rd.lvm.lv=centos00/swap rhgb quiet
16:26:18,655 DEBUG anaconda: running handleException
16:26:18,657 CRIT anaconda: Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 227, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 254, in doInstall
    writeBootLoader(storage, payload, instClass, ksdata)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 2440, in writeBootLoader
    writeBootLoaderFinal(storage, payload, instClass, ksdata)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 2368, in writeBootLoaderFinal
    storage.bootloader.write()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1771, in write
    self.install()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1750, in install
    self.remove_efi_boot_target()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1706, in remove_efi_boot_target
    buf = self.efibootmgr(capture=True)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1699, in efibootmgr
    return exec_func("efibootmgr", list(args), **kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/iutil.py", line 345, in execWithCapture
    filter_stderr=filter_stderr)[1]
  File "/usr/lib64/python2.7/site-packages/pyanaconda/iutil.py", line 259, in _run_program
    env_prune=env_prune)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/iutil.py", line 185, in startProgram
    preexec_fn=preexec, cwd=root, env=env, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
16:26:18,658 DEBUG anaconda: Gtk running, queuing exception handler to the main loop
16:26:18,658 INFO anaconda: Thread Done: AnaInstallThread (139766683113216)
16:26:48,266 INFO anaconda: Running kickstart %%traceback script(s)
16:26:48,266 INFO anaconda: All kickstart %%traceback script(s) have been run
============================================================================

Test version:
ovirt-node-ng-installer-master-2016050300.iso
imgbased-0.6-0.201604241653git1e3b28f.el7.centos.noarch
ovirt-release-host-node-4.0.0-0.3.master.20160428135304.git037679a.el7.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-0.3.master.20160428135304.git037679a.el7.noarch

1. Install NGN 4.0 with Anaconda on a UEFI machine.
2. Watch the installation process; an error occurs.
efibootmgr was not included, and is necessary. There's a patch which resolves this (as soon as it's merged, the next build will work). However, it's necessary to match it with the appropriate stage2 and/or ISO. I'm not sure if this is or should be an Anaconda bug, but using a RHEL boot ISO as the source with a centos squashfs will lead to a failure during bootloader installation (/boot/efi/EFI/centos instead of /boot/efi/EFI/redhat). The same with Fedora, which is the likely cause of the earlier traceback. I'm guessing that the Anaconda stage2 assumes the pathnames will match, but efibootmgr/grub2-efi are run outside the context of the install root, and the wrong paths are inferred. Anaconda distro == install distro is a fair assumption, but it's a caveat we need to be aware of for ngn.
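To illustrate the path mismatch described above, here is a minimal sketch that lists which vendor directories actually exist under boot/efi/EFI in an install root and compares them with the name the installer is expected to use. The function names efi_vendor_dirs and check_efi_dir are made up for this example and are not anaconda code. The comparison is an exact string match, whereas a real EFI system partition is normally a case-insensitive FAT filesystem.

```python
import os
import tempfile


def efi_vendor_dirs(sysroot):
    """List the vendor directories under <sysroot>/boot/efi/EFI (empty if missing)."""
    efi_root = os.path.join(sysroot, "boot", "efi", "EFI")
    if not os.path.isdir(efi_root):
        return []
    return sorted(
        name for name in os.listdir(efi_root)
        if os.path.isdir(os.path.join(efi_root, name))
    )


def check_efi_dir(sysroot, expected):
    """Return (ok, present); ok is True when the expected vendor directory exists."""
    present = efi_vendor_dirs(sysroot)
    return expected in present, present


if __name__ == "__main__":
    # Simulate an install root populated from a CentOS squashfs while the
    # installer expects a Fedora layout, as in the logs attached to this bug.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "boot", "efi", "EFI", "centos"))
        ok, present = check_efi_dir(root, "fedora")
        print("expected directory present:", ok, "found:", present)
```

In the failing installs, the equivalent check against /mnt/sysimage would presumably show only "centos" while grub2-mkconfig was pointed at "fedora".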
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
I can still reproduce this issue on the ovirt-node-ng-installer-master-2016051100 build.
I'll check this again, since I verified the fix (in derive-boot-iso and the addition of efibootmgr). Can you please attach new logs?
Created attachment 1157802 [details] uefi-new-all-logs
I cannot reproduce this locally. This appears to be a problem specific to Jenkins builds. I'm looking into why.

Workaround:
Download a CentOS 7 boot ISO
Get ovirt-node-ng-image.squashfs.img from jenkins
Clone ovirt-node-ng
/path/to/ovirt-node-ng/scripts/derive-boot-iso.sh /path/to/centos.iso /path/to/squashfs.img ./ovirt-node-ng.iso

An ISO generated this way will work.
Note: The uploaded logs don't work. In testing locally (from jenkins master), I got the same error about /EFI/fedora... as earlier, which leads me to believe that some Fedora parts are being injected somewhere in Jenkins, though I can't find it in the logs or git (I need to test using the ovirt-node-ng-tools RPM) My development workstation is Fedora, so being in a Fedora root upstream shouldn't affect this, as it works locally...
Created attachment 1157943 [details] re-upload-all-log
(In reply to shaochen from comment #30)
> Created attachment 1157943 [details]
> re-upload-all-log

This shows the same error:

15:28:14,742 INFO program: Running... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
15:28:15,439 INFO program: /sbin/grub2-mkconfig: line 241: /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory

Please try with the workaround from comment#27
(In reply to Ryan Barry from comment #31)
> (In reply to shaochen from comment #30)
> > Created attachment 1157943 [details]
> > re-upload-all-log
>
> This shows the same error:
>
> 15:28:14,742 INFO program: Running... grub2-mkconfig -o
> /boot/efi/EFI/fedora/grub.cfg
> 15:28:15,439 INFO program: /sbin/grub2-mkconfig: line 241:
> /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory
>
> Please try with the workaround from comment#27

I tested with the workaround below:
Download a CentOS 7 boot ISO
Get ovirt-node-ng-image.squashfs.img (20160511_0_el7_noarch_rpm-master) from jenkins
Clone ovirt-node-ng
/path/to/ovirt-node-ng/scripts/derive-boot-iso.sh /path/to/centos.iso /path/to/squashfs.img ./ovirt-node-ng.iso

Test steps:
1. Install NGN 4.0 with Anaconda on a UEFI machine.
2. Try to finish the installation.
3. Reboot NGN.
4. Try to boot NGN.

Test result:
1. After steps 2 and 3, NGN installs successfully on the UEFI machine.
2. NGN fails to boot on the UEFI machine.

Additional info:
The same test with vintage RHEV-H boots successfully on the same UEFI machine.
Created attachment 1158199 [details] UEFI-BOOT
Created attachment 1158200 [details] UEFI-BOOT-FAILED
Please attach log files for the failed attempt in comment 32.
shim.efi appears to be missing. I'm doing a build to test this, and I'll verify that it boots after installation...
I've verified that including both efibootmgr and shim produces an image which is installable and bootable. I'll update redhat-release-rhev-hypervisor as well...
(In reply to Fabian Deutsch from comment #35)
> Please attach log files for the failed attempt in comment 32.

After confirming with Fabian on IRC, this bug seems to have been fixed, and the final gold signed NGN squashfs and NGN ISO are planned for tonight. The logs are no longer needed to reproduce the bug, so I am cancelling the needinfo. Thanks.
Test version:
ovirt-node-ng-installer-master-2016061511.iso

Test steps:
1. Install NGN 4.0 with Anaconda on a UEFI machine.
2. Try to finish the installation (an error appears, as shown in the "uefi" screenshots).
3. Try to ignore the error and continue the installation.
4. The installation still fails.

Test result:
The installation still fails. (There was an error running the ks at line 6. This is fatal error and installation will be aborted.)

I can't provide more log info because pressing ctrl+alt+F2~F5 does not enter a shell, but I will try to obtain the logs another way ASAP.

Changing bug status to ASSIGNED.
Created attachment 1168587 [details] june-16-uefi1
Created attachment 1168588 [details] june-16-uefi2
Created attachment 1168589 [details] june-16-uefi3
The logs from the installation are needed; a screenshot is not sufficient to debug the issue.
(In reply to Fabian Deutsch from comment #44)
> The logs from the installation are needed, a screenshot is not sufficient to
> debug the issue.

Actually, the logs aren't necessary either.

This will work downstream, but the problem upstream (or with CentOS images) is that the installclass sets efi_dir to "redhat" (adding "oVirt Node" was a relatively quick fix, and efi_dir was not set).

This will work tomorrow:

https://github.com/evol262/anaconda/commit/2e879667a618d4c6f37def1f477f26acac16b780
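For readers unfamiliar with install classes, here is a rough sketch of one plausible reading of the comment above: a product class that does not override efi_dir silently inherits the parent's value, so the generated grub.cfg path points at a directory the image does not ship. The class names, the grub_cfg_path helper and the "centos" value are illustrative assumptions for this example only; this is not the anaconda source and not the linked commit.

```python
import posixpath


class BaseInstallClass:
    """Hypothetical stand-in for a parent install class (not the real anaconda class)."""
    name = "base"
    efi_dir = "redhat"  # inherited by every subclass that does not override it


class QuickFixNodeClass(BaseInstallClass):
    """Only the product name was added; efi_dir is left unset and therefore inherited."""
    name = "oVirt Node"


class FixedNodeClass(BaseInstallClass):
    """efi_dir is set explicitly to match the directory shipped in the image."""
    name = "oVirt Node"
    efi_dir = "centos"  # assumed value for a CentOS-based image


def grub_cfg_path(install_class):
    """Build the grub.cfg path the boot loader code would write for this class."""
    return posixpath.join("/boot/efi/EFI", install_class.efi_dir, "grub.cfg")


if __name__ == "__main__":
    for cls in (QuickFixNodeClass, FixedNodeClass):
        print(cls.__name__, "->", grub_cfg_path(cls))
```

The point of the sketch is only that the directory name is data on the install class, so a class added in a hurry can produce the wrong path without any code error.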
Good catch, tho I have a comment on the implementation.
(In reply to Ryan Barry from comment #45)
> (In reply to Fabian Deutsch from comment #44)
> > The logs from the installation are needed, a screenshot is not sufficient to
> > debug the issue.
>
> Actually, the logs aren't necessary either.
>
> This will work downstream, but the problem upstream (or with CentOS images)
> is that the installclass sets efi_dir to "redhat" (adding "oVirt Node" was a
> relatively quick fix, and efi_dir was not set)
>
> This will work tomorrow:
>
> https://github.com/evol262/anaconda/commit/
> 2e879667a618d4c6f37def1f477f26acac16b780

Thanks Ryan. Cancelling the needinfo, since the logs aren't necessary.
(In reply to Ryan Barry from comment #45)
> (In reply to Fabian Deutsch from comment #44)
> > The logs from the installation are needed, a screenshot is not sufficient to
> > debug the issue.
>
> Actually, the logs aren't necessary either.
>
> This will work downstream, but the problem upstream (or with CentOS images)
> is that the installclass sets efi_dir to "redhat" (adding "oVirt Node" was a
> relatively quick fix, and efi_dir was not set)
>
> This will work tomorrow:
>
> https://github.com/evol262/anaconda/commit/
> 2e879667a618d4c6f37def1f477f26acac16b780

Test version:
rhev-hypervisor7-ng-4.0-20160616.0.x86_64
imgbased-0.7.0-0.1.el7ev.noarch
redhat-release-rhev-hypervisor-4.0-0.7.el7.x86_64

Test result:
The installation still fails with the above downstream build. For details, please refer to the "uefi" log.
Created attachment 1169744 [details] uefi.log
Created attachment 1169745 [details] uefi.png
The screenshot in comment 50 looks like a different bug. Please provide the anaconda logs.
Created attachment 1169804 [details] /tmp/all_log
From the logs:

RuntimeError: An existing imgbase was found with tags, but imgbase was called with --init. If this wasintentional, please untag the existing volumes and try again.

Did you make sure to remove the existing VG from the disk, i.e. delete everything which was on disk?
(In reply to Fabian Deutsch from comment #53)
> From the logs:
>
> RuntimeError: An existing imgbase was found with tags, but imgbase was
> called with --init. If this wasintentional, please untag the existing
> volumes and try again.
>
> Did you ensure to remove the existing VG from the disk, i.e. delete
> everything which was on disk?

After removing the existing VG on the UEFI machine, the issue from #c48 is gone, so please ignore #c48. However, I still hit the original bug.
Ryan, I suppose this is because the install class patch was reverted. Maybe we can change the title again to make sure EFI is working.
Created attachment 1170204 [details] uefi_all_log
(In reply to Fabian Deutsch from comment #55)
> Ryan,
> I suppose this is because the install class patch was reverted, maybe we can
> change the title again to make sure EFI is working.

The situation here is confusing, since this bug is being tested both against upstream and downstream.

Yes, the patch was reverted downstream (though it's re-applied as of last week, and will be in the next engineering distill). However, it's present upstream.

To verify this downstream, we can have jboggs distill a new beta 1 ISO, or wait for a distilled beta 2 image (or whatever includes anaconda-21.48.22.56-5.el7ev). But this should be able to be verified right now upstream.
Test version:
RHEV-H-7.2-20160621.4-RHVH-x86_64-dvd1.iso
imgbased-0.7.0-0.1.el7ev.noarch
redhat-release-rhev-hypervisor-4.0-0.7.el7.x86_64

Test machine:
Dell R210 UEFI

Test steps:
1. Boot from the ISO.
2. Run an interactive Anaconda install of NGN 4.0 in UEFI mode.
3. Install NGN on local storage.
4. Reboot NGN.
5. Log in to NGN in UEFI mode.

Test result:
After step 3, no error occurred during the installation process.
After step 5, logging in to NGN in UEFI mode succeeds.
Test version:
redhat-virtualization-host-4.0-20160714.3
imgbased-0.7.2-0.1.el7ev.noarch
redhat-release-virtualization-host-4.0-0.20.el7.x86_64

Test steps:
1. Install RHVH 4.0 with Anaconda on a UEFI machine.
2. Finish the installation with the correct steps.
3. Reboot RHVH.
4. Boot RHVH in UEFI mode.

Test result:
RHVH boots on the UEFI machine.

The bug is fixed, so I am changing the bug status to VERIFIED.