The following was filed automatically by anaconda:
anaconda 13.41 exception report
Traceback (most recent call first):
  File "/usr/lib/anaconda/isys.py", line 95, in lochangefd
    _isys.lochangefd(loop, targ)
  File "/usr/lib/anaconda/backend.py", line 190, in mountInstallImage
    isys.lochangefd("/dev/loop0", self._loopbackFile)
  File "/usr/lib/anaconda/yuminstall.py", line 904, in run
    if self.anaconda.backend.mountInstallImage(self.anaconda, stage2img):
  File "/usr/lib/anaconda/yuminstall.py", line 1703, in doInstall
    rc = self.ayum.run(self.instLog, cb, anaconda.intf, anaconda.id)
  File "/usr/lib/anaconda/backend.py", line 299, in doInstall
    return anaconda.backend.doInstall(anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 205, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 126, in gotoNext
    self.moveStep()
  File "/usr/lib/anaconda/gui.py", line 1313, in nextClicked
    self.anaconda.dispatch.gotoNext()
  File "/usr/lib/anaconda/iw/progress_gui.py", line 79, in renderCallback
    self.intf.icw.nextClicked()
  File "/usr/lib/anaconda/gui.py", line 1334, in handleRenderCallback
    self.currentWindow.renderCallback()
SystemError: (22, 'Invalid argument')
Created attachment 412807 [details] Attached traceback automatically from anaconda.
I was doing this test case: https://fedoraproject.org/wiki/QA/TestCases/InstallSourceNfs That means installing from NFS. The exception happened after "Transferring install image to hard drive" dialog.
Tried two more times, always the same error.
I tried to mount the contents of the DVD and export it via NFS and then the installation worked. Maybe it can be an issue with the NFS mirror I used before? (local BRQ RH mirror)
let's get some obvious stuff out of the way first. Does /mnt/source/rhinstall-install.img even exist?
Created attachment 412877 [details] filelist from under /mnt/source There is no /mnt/source/rhinstall-install.img. But that file isn't available even on mounted DVD and exported over NFS. I attach the full filelist of files in our local BRQ mirror mounted under /mnt/source.
/mnt/sysimage/rhinstall-install.img of course exists, 142MB.
Created attachment 412939 [details] Attached traceback automatically from anaconda.
Dlehman inspected the traceback for a better idea on how to reproduce this. Dave recommended: 1) Boot the boot.iso (or netinst.iso). 2) Add a boot parameter: repo=nfs:<server>:<path>. Following this procedure, I am able to reproduce this failure.
I haven't seen it then because in my NFS test cases, the boot images are delivered via pxe, which I think is the more typical case when dealing with NFS based install.
Created attachment 413033 [details] Attached traceback automatically from anaconda.
Verified Comment 4 and Comment 9. It happens when I use the NFS of local mirror. But I think the common case of NFS based install is to mount the contents of the DVD and export it via NFS, and it works in that case.
I thought I knew how to reproduce this failure (boot physical media and do method/repo=nfs:server:path). But in testing this and potential workarounds, I'm no longer able to trigger the failure. Will continue testing ...
Reviewed by QA, Release Engineering, etc. at the 2010-05-11 Go/No-Go Meeting. After lengthy discussion, the conclusion was that this bug should continue to be a blocker.
kparal or rhe, I'll continue to try to isolate the exact failure conditions in this bug, but any help you both can offer to pinpoint the root cause would be appreciated. Thanks! dlehman: Any suspect areas from the posted anacdump files?
Created attachment 413315 [details] Attached traceback automatically from anaconda.
Above traceback comes from attempting NFS install on a VirtualBox guest, using the host as the NFS server. I loop-mounted the DVD on the host, attached boot.iso to the guest, and passed repo=nfs:<server>:<path> when booting it. I first did the same thing without any problems installing 32-bit, then it failed with the above traceback on 64-bit.
Created attachment 413316 [details] Attached traceback automatically from anaconda.
Went through exactly the same procedure - only difference was that the guest had a different IP address this time - and got the above traceback again. Seems fairly reproducible. P.S. Just realized I made a mistake - I had the i386 DVD loop-mounted, but was booting the x86_64 netinst image. Interesting I get the same traceback though the repo content is wrong. Trying again with the x86_64 DVD loop-mounted and booting from the x86_64 netinst image.
Not sure what's going on. I'm getting the error "Some of the packages you have selected for install are missing dependencies." on two successive install attempts. Will quit for now.
One workaround is to use 'askmethod' as a kernel parameter in grub, then specify the NFS URL as the install source in stage 1. The install then works without hitting this issue. By using 'repo=' parameter, I compared F13 nfs install with Http install and F12 nfs install, and found out that http install and F12 nfs install don't require "Transferring install image to hard drive" step. They begin "starting installation Process" after checking dependencies, and the steps in anaconda.log and files under /mnt/sysimage are similar. So I don't know why the F13 NFS install takes a different path and requires /mnt/sysimage/rhinstall-install.img to exist.
(In reply to comment #21) > By using 'repo=' parameter, I compared F13 nfs install with Http install and > F12 nfs install, and found out that http install and F12 nfs install don't > require "Transferring install image to hard drive" step. They begin "starting > installation Process" after checking dependencies, and the steps in > anaconda.log and files under /mnt/sysimage are similar. I can confirm this and I already mentioned it to clumens. NFS install shows "Transferring install image to hard drive" (and fails), HTTP install doesn't show that dialog (and continues).
In contrast to all the people here, I am able to get the traceback only with our local BRQ mirror (nfs.englab.brq.redhat.com). When mounting DVD image and exporting it over NFS from localhost, the install process proceeds fine (even the "Transferring install image to hard drive" dialog goes ok). Using "askmethod" instead of "method=nfs:..." seems to be a workaround, the install doesn't fail for me in that case.
(In reply to comment #22) > (In reply to comment #21) > > By using 'repo=' parameter, I compared F13 nfs install with Http install and > > F12 nfs install, and found out that http install and F12 nfs install don't > > require "Transferring install image to hard drive" step. They begin "starting > > installation Process" after checking dependencies, and the steps in > > anaconda.log and files under /mnt/sysimage are similar. > > I can confirm this and I already mentioned it to clumens. NFS install shows > "Transferring install image to hard drive" (and fails), HTTP install doesn't > show that dialog (and continues). Yeah, yeah. And it seems that even if there's no "Transferring install image to hard drive" step shown, /mnt/sysimage can still get the files the same as the ones in /mnt/sysimage/rhinstall-install.img.
Created attachment 413407 [details] Attached traceback automatically from anaconda.
It seems that the differing boot and repo arch may easily trigger this failure. However, from the attached tracebacks, even a matching boot and repo arch seems to trigger the failure.

=== attachment#412807 [details] ===
Boot arch = x86_64
Repo arch = x86_64
11:43:33,144 INFO loader: anaconda version 13.41 on x86_64 starting
11:43:33,143 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" method=nfs:nfs.englab.brq.redhat.com:/pub/fedora/linux/development/13/x86_64/os/ BOOT_IMAGE=vmlinuz

=== attachment#412939 [details] ===
Boot arch = i386
Repo arch = x86_64
19:06:37,161 INFO loader: anaconda version 13.41 on i386 starting
19:06:37,161 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" xdriver=vesa nomodeset repo=nfs:dell-t5400.test.redhat.com:/var/www/cobbler/ks_mirror/F-13-RC2-x86_64 BOOT_IMAGE=vmlinuz

=== attachment#413033 [details] ===
Boot arch = x86_64
Repo arch = x86_64
03:26:05,018 INFO loader: anaconda version 13.41 on x86_64 starting
03:26:05,017 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" repo=nfs:nfs.englab.nay.redhat.com:/pub/fedora/linux/development/13/x86_64/os BOOT_IMAGE=vmlinuz

=== attachment#413315 [details] ===
Boot arch = x86_64
Repo arch = UNKNOWN
22:43:44,902 INFO loader: anaconda version 13.41 on x86_64 starting
22:43:44,901 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" repo=nfs:compaq-pc:/tmp/foo/ BOOT_IMAGE=vmlinuz

=== attachment#413316 [details] ===
Boot arch = x86_64
Repo arch = UNKNOWN
22:59:53,870 INFO loader: anaconda version 13.41 on x86_64 starting
22:59:53,869 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" repo=nfs:compaq-pc:/tmp/foo/ BOOT_IMAGE=vmlinuz

=== attachment#413407 [details] ===
Boot arch = i386
Repo arch = x86_64
11:55:19,149 INFO loader: anaconda version 13.41 on i386 starting
11:55:19,148 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" xdriver=vesa nomodeset repo=nfs:dell-t5400.test.redhat.com:/var/www/cobbler/ks_mirror/F-13-RC2-x86_64 BOOT_IMAGE=vmlinuz
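For anyone triaging further tracebacks, the boot arch and the repo source can be pulled out of the loader log lines mechanically. The following is only a rough sketch: `summarize_loader_log` and both regular expressions are hypothetical helpers written for this comment, not anything shipped with anaconda, and they assume the log lines look like the ones quoted above.

```python
import re

# Hypothetical patterns, based only on the loader log lines quoted in this bug.
ARCH_RE = re.compile(r"anaconda version (\S+) on (\S+) starting")
REPO_RE = re.compile(r"\b(?:repo|method)=(\S+)")


def summarize_loader_log(lines):
    """Collect the anaconda version, boot arch and repo=/method= value."""
    info = {"version": None, "boot_arch": None, "repo": None}
    for line in lines:
        match = ARCH_RE.search(line)
        if match:
            info["version"], info["boot_arch"] = match.group(1), match.group(2)
        match = REPO_RE.search(line)
        if match:
            info["repo"] = match.group(1)
    return info


if __name__ == "__main__":
    sample = [
        "19:06:37,161 INFO loader: anaconda version 13.41 on i386 starting",
        "19:06:37,161 INFO loader: kernel command line: initrd=initrd.img "
        'stage2=hd:LABEL="Fedora" xdriver=vesa nomodeset '
        "repo=nfs:dell-t5400.test.redhat.com:/var/www/cobbler/ks_mirror/F-13-RC2-x86_64 "
        "BOOT_IMAGE=vmlinuz",
    ]
    print(summarize_loader_log(sample))
```

The repo arch still has to be judged by eye from the path, since the log lines do not state it directly.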
I have tested older installers. F13 Beta netinst (anaconda 13.37.2) crashes too. F13 Alpha netinst (anaconda 13.32) doesn't crash (and there is no "Transferring install image to hard drive" dialog).
Started over, with the 64-bit DVD mounted, booting from the 64-bit netinst image, passing "repo=nfs:<server>:<path>", and it works this time (starts installing packages). So the only way I can reproduce it is by using the wrong repo.
I have used http://kojipkgs.fedoraproject.org/mash/ to narrow down when this issue appeared. Funnily enough, it first appeared exactly on April Fools' Day. I can install from boot-20100331.iso just fine, but it crashes from boot-20100401.iso. And guess what - 0331 has no "Transferring install image to hard drive" dialog, 0401 has it.
11:14 < clumens> jlaska: compare install.img on the CD to /mnt/sysimage/rhinstall-install.img
11:14 < clumens> kparal: that was for you
11:15 < kparal> clumens: so extract boot.iso, take install.img, and compare it to rhinstall-install.img copied out of the VM, right?
11:15 < clumens> yeah
11:16 < dlehman> or just mount boot.iso inside the vm and compare there
11:16 < clumens> if they're not the same, that's one reason why this could asplode
11:16 < clumens> in the meantime, i'll go read loop_change_fd and see why else it might give EINVAL
11:16 < clumens> i can't help but think i've been here before.
11:17 < clumens> kparal: you can also check what losetup /dev/loop0 says
11:17 < kparal> -rw-r--r--. 1 root root 144928768 2010-04-01 18:27 install.img
11:17 < kparal> -rw-r--r--. 1 root root 148127744 2010-05-12 17:11 rhinstall-install.img
11:17 -!- Kyril [~Kyril.bell.ca] has joined #fedora-qa
11:18 < clumens> no more callers, please, we have a winner.
11:18 < kparal> clumens: losetup printed me usage line (?)
11:18 < clumens> well that doesn't matter. the fact that they're different sizes is the reason.
11:18 < clumens> 706 error = -EINVAL;
11:19 < clumens> 711 /* size of the new backing store needs to be the same */
11:19 < clumens> 712 if (get_loop_size(lo, file) != get_loop_size(lo, old_file))
11:19 < clumens> 713 goto out_putf;
11:19 < dlehman> and this didn't used to matter because we never bothered with the install.img from the repo if we already had one
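The size check quoted from the kernel can be approximated from userspace before the loop device's backing file is swapped. This is a minimal sketch, not anaconda code: `size_in_sectors` and `can_swap_backing_file` are hypothetical names, and it assumes the loop device has no offset configured and that the kernel compares sizes in 512-byte sectors.

```python
import os

SECTOR_SIZE = 512  # assumption: the loop driver compares sizes in 512-byte sectors


def size_in_sectors(num_bytes):
    """Convert a byte count to whole 512-byte sectors (no loop offset assumed)."""
    return num_bytes // SECTOR_SIZE


def can_swap_backing_file(old_path, new_path):
    """Return True if the two images have the same size in sectors.

    A False result means a backing-file swap would be expected to fail
    with EINVAL, as in the traceback in this bug.
    """
    old_size = size_in_sectors(os.path.getsize(old_path))
    new_size = size_in_sectors(os.path.getsize(new_path))
    return old_size == new_size


if __name__ == "__main__":
    # Sizes taken from the directory listing in the IRC log above.
    boot_iso_image = 144928768
    copied_image = 148127744
    print(size_in_sectors(boot_iso_image) == size_in_sectors(copied_image))  # False
```

Equal sizes do not prove the two images are the same build, so this only predicts the EINVAL case; it is not a full compatibility check.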
So here's the recipe to hit this bug: You must boot from boot.iso and specify an NFS repo whose images/install.img is not the same file as the images/install.img on boot.iso. So don't do that.
(In reply to comment #31) > So here's the recipe to hit this bug: You must boot from boot.iso and specify > an NFS repo whose images/install.img is not the same file as the > images/install.img on boot.iso. So don't do that. Sweet, nice job nailing the failure case folks!
Right, so we've decided this is a big WONTFIX? Please don't mix and match sources for boot.iso and NFS repo to avoid this.
I don't object to the WONTFIX, but: 1. This decision should be visibly documented somewhere, because I (as the end-user) don't really know anything about implementation stuff like install.img inside boot.iso and inside repo. 2. There should be a check for this issue in the source code. You don't have to fix it or work around it, but you should at least display a dialog "Your boot.iso does not match the version of your NFS repository. Please... <do something>." and not just display a traceback. Otherwise you will see this reported again and again.
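To illustrate point 2, here is one way such a check might look. It is only a sketch under assumptions: `InstallImageMismatch` and `change_loop_backing` are hypothetical names, the real call is passed in as a plain callable rather than imported from anaconda, the wording after "Please" is a placeholder suggestion, and the code is written for current Python rather than the Python version anaconda 13 ran on. It relies only on the exception shape shown in the traceback, SystemError: (22, 'Invalid argument').

```python
import errno


class InstallImageMismatch(Exception):
    """Hypothetical error raised when the loop backing file cannot be swapped."""


def change_loop_backing(lochangefd, device, image_path):
    """Call lochangefd(device, image_path) and translate EINVAL into a readable error.

    `lochangefd` stands in for the real isys call; it is passed in so that
    this sketch does not depend on anaconda itself.
    """
    try:
        lochangefd(device, image_path)
    except SystemError as exc:
        code = exc.args[0] if exc.args else None
        if code == errno.EINVAL:
            raise InstallImageMismatch(
                "Your boot.iso does not match the version of your NFS "
                "repository: the install.img files differ. Please use boot "
                "media and a repository from the same build."
            ) from exc
        raise  # any other failure is reported unchanged
```

The caller could then catch InstallImageMismatch and show its message in a dialog instead of letting the raw traceback reach the user.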
(In reply to comment #34) > I don't object against the WONTFIX, but: > > 1. This decision should be visibly documented somewhere, because I (as the > end-user) don't really know anything about implementation stuff like > install.img inside boot.iso and inside repo. I just marked this bug for inclusion to the CommonBugs page. Adam Williamson or I will walk that queue in a couple of days to document this issue. Of course, anyone is welcome to document this directly in the meantime (see http://fedoraproject.org/wiki/Common_F13_bugs#My_bug_is_not_listed).
Reproduced it in F14-Alpha-TC2 since the install.img in branched directory is different from that in boot.iso.
I believe comments #31 and #33 still apply, though.
Hurry, Chris is right. While confusing, this problem seems to manifest only during testing of Fedora before it is GA'd. Comment#31 explains it well. Basically, when testing snapshots, unless otherwise noted, we'll need to test using content provided only by that snapshot (includes install.img, vmlinuz, initrd.img and repodata).
I believe my comment #34 still applies as well. Unless the traceback is fixed and some meaningful error message is displayed, this bug will be reported over and over again. Maybe not by us, because we already know, but by other people. And the traceback is so cryptic that it will steal some brain cycles and time even from members of the QA team. It will be necessary to search for this bug and compare the traceback to know whether it should be reported or not.
clumens: will this problem magically go away with your proposed redesign of the stage2 install.img handling?
It should. I haven't gotten nearly this far into the work yet, but I don't think we'll even need to do the lochangefd anymore.