590640 – SystemError: (22, 'Invalid argument')

Summary: SystemError: (22, 'Invalid argument')

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	anaconda
Sub Component:
Version:	14
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Anaconda Maintenance Team
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:	anaconda_trace_hash:4e1410999d178390e...
Depends On:
Blocks:	F13Blocker, F13FinalBlocker

Reported:	2010-05-10 11:49 UTC by Kamil Páral
Modified:	2013-01-10 05:57 UTC
CC List:	9 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2010-08-04 14:15:39 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Attached traceback automatically from anaconda. (334.05 KB, text/plain) 2010-05-10 11:49 UTC, Kamil Páral	no flags
filelist from under /mnt/source (1015.91 KB, text/plain) 2010-05-10 15:34 UTC, Kamil Páral	no flags
Attached traceback automatically from anaconda. (503.22 KB, text/plain) 2010-05-10 19:16 UTC, James Laska	no flags
Attached traceback automatically from anaconda. (352.46 KB, text/plain) 2010-05-11 03:28 UTC, He Rui	no flags
Attached traceback automatically from anaconda. (346.72 KB, text/plain) 2010-05-12 02:52 UTC, Andre Robatino	no flags
Attached traceback automatically from anaconda. (290.11 KB, text/plain) 2010-05-12 03:05 UTC, Andre Robatino	no flags
Attached traceback automatically from anaconda. (397.89 KB, text/plain) 2010-05-12 12:04 UTC, James Laska	no flags

Description Kamil Páral 2010-05-10 11:49:39 UTC

The following was filed automatically by anaconda:
anaconda 13.41 exception report
Traceback (most recent call first):
  File "/usr/lib/anaconda/isys.py", line 95, in lochangefd
    _isys.lochangefd(loop, targ)
  File "/usr/lib/anaconda/backend.py", line 190, in mountInstallImage
    isys.lochangefd("/dev/loop0", self._loopbackFile)
  File "/usr/lib/anaconda/yuminstall.py", line 904, in run
    if self.anaconda.backend.mountInstallImage(self.anaconda, stage2img):
  File "/usr/lib/anaconda/yuminstall.py", line 1703, in doInstall
    rc = self.ayum.run(self.instLog, cb, anaconda.intf, anaconda.id)
  File "/usr/lib/anaconda/backend.py", line 299, in doInstall
    return anaconda.backend.doInstall(anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 205, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 126, in gotoNext
    self.moveStep()
  File "/usr/lib/anaconda/gui.py", line 1313, in nextClicked
    self.anaconda.dispatch.gotoNext()
  File "/usr/lib/anaconda/iw/progress_gui.py", line 79, in renderCallback
    self.intf.icw.nextClicked()
  File "/usr/lib/anaconda/gui.py", line 1334, in handleRenderCallback
    self.currentWindow.renderCallback()
SystemError: (22, 'Invalid argument')

Comment 1 Kamil Páral 2010-05-10 11:49:44 UTC

Created attachment 412807 [details]
Attached traceback automatically from anaconda.

Comment 2 Kamil Páral 2010-05-10 11:55:07 UTC

I was doing this test case:
https://fedoraproject.org/wiki/QA/TestCases/InstallSourceNfs

That means installing from NFS. The exception happened after "Transferring install image to hard drive" dialog.

Comment 3 Kamil Páral 2010-05-10 12:43:28 UTC

Tried two more times, always the same error.

Comment 4 Kamil Páral 2010-05-10 13:21:43 UTC

I tried to mount the contents of the DVD and export it via NFS and then the installation worked. Maybe it can be an issue with the NFS mirror I used before? (local BRQ RH mirror)

Comment 5 Chris Lumens 2010-05-10 14:50:25 UTC

Let's get some obvious stuff out of the way first.  Does /mnt/source/rhinstall-install.img even exist?

Comment 6 Kamil Páral 2010-05-10 15:34:07 UTC

Created attachment 412877 [details]
filelist from under /mnt/source

There is no /mnt/source/rhinstall-install.img. But that file isn't there even when the DVD is mounted and exported over NFS. I attach the full list of files in our local BRQ mirror mounted under /mnt/source.

Comment 7 Kamil Páral 2010-05-10 16:35:07 UTC

/mnt/sysimage/rhinstall-install.img of course exists, 142MB.

Comment 8 James Laska 2010-05-10 19:16:05 UTC

Created attachment 412939 [details]
Attached traceback automatically from anaconda.

Comment 9 James Laska 2010-05-10 19:17:58 UTC

Dlehman inspected the traceback for a better idea of how to reproduce this.  Dave recommended:
 1) Boot the boot.iso (or netinst.iso)
 2) Add a boot parameter: repo=nfs:<server>:<path>

Doing this procedure, I am able to reproduce this failure.

Comment 10 Jesse Keating 2010-05-10 19:32:14 UTC

I haven't seen it then because in my NFS test cases, the boot images are delivered via pxe, which I think is the more typical case when dealing with NFS based install.

Comment 11 He Rui 2010-05-11 03:28:08 UTC

Created attachment 413033 [details]
Attached traceback automatically from anaconda.

Comment 12 He Rui 2010-05-11 03:38:36 UTC

Verified Comment 4 and Comment 9. It happens when I use the NFS export of the local mirror. But I think the common case of an NFS-based install is to mount the contents of the DVD and export it via NFS, and it works in that case.

Comment 13 James Laska 2010-05-11 19:03:26 UTC

I thought I knew how to reproduce this failure (boot physical media and do method/repo=nfs:server:path).  But in testing this and potential workarounds, I'm no longer able to trigger the failure.

Will continue testing ...

Comment 14 John Poelstra 2010-05-12 00:44:10 UTC

Reviewed by QA, Release Engineering, etc. at the 2010-05-11 Go/No-Go Meeting. After lengthy discussion, this bug should continue to be a blocker.

Comment 15 James Laska 2010-05-12 01:29:50 UTC

kparal or rhe, I'll continue to try to isolate the exact failure conditions in this bug, but any help you both can offer to pinpoint the root cause would be appreciated.  Thanks!

dlehman: Any suspect areas from the posted anacdump files?

Comment 16 Andre Robatino 2010-05-12 02:52:40 UTC

Created attachment 413315 [details]
Attached traceback automatically from anaconda.

Comment 17 Andre Robatino 2010-05-12 02:57:16 UTC

Above traceback comes from attempting NFS install on a VirtualBox guest, using the host as the NFS server.  I loop-mounted the DVD on the host, attached boot.iso to the guest, and passed repo=nfs:<server>:<path> when booting it.  I first did the same thing without any problems installing 32-bit, then it failed with the above traceback on 64-bit.

Comment 18 Andre Robatino 2010-05-12 03:05:43 UTC

Created attachment 413316 [details]
Attached traceback automatically from anaconda.

Comment 19 Andre Robatino 2010-05-12 03:17:09 UTC

Went through exactly the same procedure - only difference was that the guest had a different IP address this time - and got the above traceback again.  Seems fairly reproducible.

P.S. Just realized I made a mistake - I had the i386 DVD loop-mounted, but was booting the x86_64 netinst image.  Interesting I get the same traceback though the repo content is wrong.  Trying again with the x86_64 DVD loop-mounted and booting from the x86_64 netinst image.

Comment 20 Andre Robatino 2010-05-12 03:28:00 UTC

Not sure what's going on.  I'm getting the error "Some of the packages you have selected for install are missing dependencies."  on two successive install attempts.  Will quit for now.

Comment 21 He Rui 2010-05-12 07:24:58 UTC

One workaround is to use 'askmethod' as a kernel parameter in GRUB, then specify the NFS URL as the install source in stage 1. The install then works without hitting this issue.

By using 'repo=' parameter, I compared F13 nfs install with Http install and F12 nfs install, and found out that http install and F12 nfs install don't require "Transferring install image to hard drive" step. They begin "starting installation Process" after checking dependencies, and the steps in anaconda.log and files under /mnt/sysimage are similar. So I don't know why F13 nfs install uses a different way and /mnt/sysimage/rhinstall-install.img must exist.

Comment 22 Kamil Páral 2010-05-12 08:15:28 UTC

(In reply to comment #21)
> By using 'repo=' parameter, I compared F13 nfs install with Http install and
> F12 nfs install, and found out that http install and F12 nfs install don't
> require "Transferring install image to hard drive" step. They begin "starting
> installation Process" after checking dependencies, and the steps in
> anaconda.log and files under /mnt/sysimage are similar. 

I can confirm this and I already mentioned it to clumens. NFS install shows "Transferring install image to hard drive" (and fails), HTTP install doesn't show that dialog (and continues).

Comment 23 Kamil Páral 2010-05-12 08:29:55 UTC

In contrast to all the people here, I am able to get the traceback only with our local BRQ mirror (nfs.englab.brq.redhat.com). When mounting DVD image and exporting it over NFS from localhost, the install process proceeds fine (even the "Transferring install image to hard drive" dialog goes ok).

Using "askmethod" instead of "method=nfs:..." seems to be a workaround, the install doesn't fail for me in that case.

Comment 24 He Rui 2010-05-12 08:45:28 UTC

(In reply to comment #22)
> (In reply to comment #21)
> > By using 'repo=' parameter, I compared F13 nfs install with Http install and
> > F12 nfs install, and found out that http install and F12 nfs install don't
> > require "Transferring install image to hard drive" step. They begin "starting
> > installation Process" after checking dependencies, and the steps in
> > anaconda.log and files under /mnt/sysimage are similar. 
> 
> I can confirm this and I already mentioned it to clumens. NFS install shows
> "Transferring install image to hard drive" (and fails), HTTP install doesn't
> show that dialog (and continues).    

Yeah, yeah. And it seems that even if there's no "Transferring install image to hard drive" step shown, /mnt/sysimage can still get the files the same as the ones in /mnt/sysimage/rhinstall-install.img.

Comment 25 James Laska 2010-05-12 12:04:26 UTC

Created attachment 413407 [details]
Attached traceback automatically from anaconda.

Comment 26 James Laska 2010-05-12 13:00:55 UTC

It seems that the differing boot and repo arch may easily trigger this failure.  

However, from the attached tracebacks, even a matching boot and repo arch seems to trigger the failure.

=== attachment#412807 [details] ===
Boot arch = x86_64
Repo arch = x86_64

  11:43:33,144 INFO loader: anaconda version 13.41 on x86_64 starting
  11:43:33,143 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" method=nfs:nfs.englab.brq.redhat.com:/pub/fedora/linux/development/13/x86_64/os/ BOOT_IMAGE=vmlinuz 

=== attachment#412939 [details] ===
Boot arch = i386
Repo arch = x86_64

  19:06:37,161 INFO loader: anaconda version 13.41 on i386 starting
  19:06:37,161 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" xdriver=vesa nomodeset repo=nfs:dell-t5400.test.redhat.com:/var/www/cobbler/ks_mirror/F-13-RC2-x86_64 BOOT_IMAGE=vmlinuz 

=== attachment#413033 [details] ===
Boot arch = x86_64
Repo arch = x86_64

  03:26:05,018 INFO loader: anaconda version 13.41 on x86_64 starting
  03:26:05,017 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" repo=nfs:nfs.englab.nay.redhat.com:/pub/fedora/linux/development/13/x86_64/os BOOT_IMAGE=vmlinuz 

=== attachment#413315 [details] ===
Boot arch = x86_64
Repo arch = UNKNOWN
  
  22:43:44,902 INFO loader: anaconda version 13.41 on x86_64 starting
  22:43:44,901 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" repo=nfs:compaq-pc:/tmp/foo/ BOOT_IMAGE=vmlinuz 

=== attachment#413316 [details] ===
Boot arch = x86_64
Repo arch = UNKNOWN

  22:59:53,870 INFO loader: anaconda version 13.41 on x86_64 starting
  22:59:53,869 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" repo=nfs:compaq-pc:/tmp/foo/ BOOT_IMAGE=vmlinuz 

=== attachment#413407 [details] ===
Boot arch = i386
Repo arch = x86_64

  11:55:19,149 INFO loader: anaconda version 13.41 on i386 starting
  11:55:19,148 INFO loader: kernel command line: initrd=initrd.img stage2=hd:LABEL="Fedora" xdriver=vesa nomodeset repo=nfs:dell-t5400.test.redhat.com:/var/www/cobbler/ks_mirror/F-13-RC2-x86_64 BOOT_IMAGE=vmlinuz
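
The per-attachment summary above was put together by reading the loader log lines by hand. A minimal Python sketch of how the same two fields could be pulled out automatically is shown here; the function name and the regular expressions are illustrative assumptions modeled only on the log lines quoted in this comment, not anything that exists in anaconda.

```python
import re

# Patterns modeled on the two loader log lines quoted in this comment.
VERSION_RE = re.compile(r"anaconda version (\S+) on (\S+) starting")
SOURCE_RE = re.compile(r"\b(?:repo|method)=(\S+)")


def summarize_loader_log(lines):
    """Return anaconda version, boot arch and install source (hypothetical helper)."""
    summary = {"version": None, "boot_arch": None, "source": None}
    for line in lines:
        match = VERSION_RE.search(line)
        if match:
            summary["version"] = match.group(1)
            summary["boot_arch"] = match.group(2)
        if "kernel command line:" in line:
            match = SOURCE_RE.search(line)
            if match:
                summary["source"] = match.group(1)
    return summary


# Sample lines copied from attachment 412939 as quoted above.
sample = [
    "19:06:37,161 INFO loader: anaconda version 13.41 on i386 starting",
    "19:06:37,161 INFO loader: kernel command line: initrd=initrd.img "
    'stage2=hd:LABEL="Fedora" xdriver=vesa nomodeset '
    "repo=nfs:dell-t5400.test.redhat.com:/var/www/cobbler/ks_mirror/F-13-RC2-x86_64 "
    "BOOT_IMAGE=vmlinuz",
]
print(summarize_loader_log(sample))
```

The repo arch still has to be inferred from the path (or from the repo contents), which is why two of the entries above are marked UNKNOWN.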

Comment 27 Kamil Páral 2010-05-12 13:29:34 UTC

I have tested older installers. F13 Beta netinst (anaconda 13.37.2) crashes too. F13 Alpha netinst (anaconda 13.32) doesn't crash (and there is no "Transferring install image to hard drive" dialog).

Comment 28 Andre Robatino 2010-05-12 14:23:01 UTC

Started over, with the 64-bit DVD mounted, booting from the 64-bit netinst image, passing "repo=nfs:<server>:<path>", and it works this time (starts installing packages).  So the only way I can reproduce it is by using the wrong repo.

Comment 29 Kamil Páral 2010-05-12 14:40:46 UTC

I have used http://kojipkgs.fedoraproject.org/mash/ to narrow down when this issue appeared. It's funny, it happened exactly on April Fools' Day. I can install from boot-20100331.iso just fine, but it crashes from boot-20100401.iso. And guess what - 0331 has no "Transferring install image to hard drive" dialog, 0401 has it.
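
One way such a search over daily images could be organized is a binary search. The sketch below is only an illustration: the function name is made up, the crashes() predicate stands in for a manual install attempt, and the 0330 and 0402 image names are hypothetical padding around the two images actually named above. It assumes that once an image crashes, every later image crashes too.

```python
def first_bad_build(builds, is_bad):
    """Return the first build for which is_bad() is true, or None.

    Assumes builds are in chronological order and that every build
    after the first bad one is also bad.
    """
    lo, hi = 0, len(builds)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid          # first bad build is at mid or earlier
        else:
            lo = mid + 1      # first bad build is after mid
    return builds[lo] if lo < len(builds) else None


builds = [
    "boot-20100330.iso",  # hypothetical
    "boot-20100331.iso",  # installs fine per this comment
    "boot-20100401.iso",  # crashes per this comment
    "boot-20100402.iso",  # hypothetical
]


def crashes(build):
    # Stand-in for "boot this image and try the NFS install".
    return build >= "boot-20100401.iso"


print(first_bad_build(builds, crashes))
```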

Comment 30 Chris Lumens 2010-05-12 15:22:01 UTC

11:14 < clumens> jlaska: compare install.img on the CD to /mnt/sysimage/rhinstall-install.img
11:14 < clumens> kparal: that was for you
11:15 < kparal> clumens: so extract boot.iso, take install.img, and compare it to 
                rhinstall-install.img copied out of the VM, right?
11:15 < clumens> yeah
11:16 < dlehman> or just mount boot.iso inside the vm and compare there
11:16 < clumens> if they're not the same, that's one reason why this could asplode
11:16 < clumens> in the meantime, i'll go read loop_change_fd and see why else it might give EINVAL
11:16 < clumens> i can't help but think i've been here before.
11:17 < clumens> kparal: you can also check what losetup /dev/loop0 says
11:17 < kparal> -rw-r--r--. 1 root root 144928768 2010-04-01 18:27 install.img
11:17 < kparal> -rw-r--r--. 1 root root 148127744 2010-05-12 17:11 rhinstall-install.img
11:17 -!- Kyril [~Kyril.bell.ca] has joined #fedora-qa
11:18 < clumens> no more callers, please, we have a winner.
11:18 < kparal> clumens: losetup printed me usage line (?)
11:18 < clumens> well that doesn't matter.  the fact that they're different sizes is the reason.
11:18 < clumens>  706        error = -EINVAL;
11:19 < clumens>  711        /* size of the new backing store needs to be the same */
11:19 < clumens>  712        if (get_loop_size(lo, file) != get_loop_size(lo, old_file))
11:19 < clumens>  713                goto out_putf;
11:19 < dlehman> and this didn't used to matter because we never bothered with the install.img from 
                 the repo if we already had one
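
The kernel check quoted in the log above can be approximated in user space before the backing file is swapped. This is a minimal sketch, not anaconda code: check_swap_allowed is a hypothetical helper, and it compares raw byte sizes, which is a simplification of whatever get_loop_size() computes in the kernel.

```python
import errno
import os
import tempfile


def check_swap_allowed(old_image, new_image):
    """Raise OSError(EINVAL) up front if the two files differ in size.

    Hypothetical helper: it compares plain file sizes in bytes, as a
    simplified stand-in for the kernel's get_loop_size() comparison.
    """
    old_size = os.path.getsize(old_image)
    new_size = os.path.getsize(new_image)
    if old_size != new_size:
        raise OSError(
            errno.EINVAL,
            "backing files differ in size: %d vs %d bytes" % (old_size, new_size),
        )


# Two throwaway files stand in for install.img and rhinstall-install.img.
with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "install.img")
    new_path = os.path.join(tmp, "rhinstall-install.img")
    with open(old_path, "wb") as handle:
        handle.write(b"\0" * 1024)
    with open(new_path, "wb") as handle:
        handle.write(b"\0" * 2048)
    try:
        check_swap_allowed(old_path, new_path)
    except OSError as exc:
        print("swap would fail:", exc)
```

With the sizes from the log above (144928768 and 148127744 bytes), this check would fail in the same way.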

Comment 31 David Lehman 2010-05-12 16:27:03 UTC

So here's the recipe to hit this bug: You must boot from boot.iso and specify an NFS repo whose images/install.img is not the same file as the images/install.img on boot.iso. So don't do that.
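
A tester who wants to know in advance whether a given boot.iso and NFS repo will hit this can compare the two install.img files. The following is a rough sketch under stated assumptions: the helper names are made up, and both images are assumed to be reachable as ordinary files (for example from a loop-mounted boot.iso and the mounted NFS export). Per comment 30 a size difference is what triggers the EINVAL; the checksum comparison is a stricter "same file" test in the spirit of the recipe above.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(boot_image, repo_image):
    """Return True if both images have the same size and content (hypothetical helper)."""
    # Cheap size check first; a mismatch here is already enough to fail.
    if os.path.getsize(boot_image) != os.path.getsize(repo_image):
        return False
    return sha256_of(boot_image) == sha256_of(repo_image)


# Throwaway files stand in for images/install.img from boot.iso and from the repo.
with tempfile.TemporaryDirectory() as tmp:
    boot_img = os.path.join(tmp, "boot-install.img")
    repo_img = os.path.join(tmp, "repo-install.img")
    with open(boot_img, "wb") as handle:
        handle.write(b"stage2 image A")
    with open(repo_img, "wb") as handle:
        handle.write(b"stage2 image B")
    print("images match:", images_match(boot_img, repo_img))
```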

Comment 32 James Laska 2010-05-12 16:37:21 UTC

(In reply to comment #31)
> So here's the recipe to hit this bug: You must boot from boot.iso and specify
> an NFS repo whose images/install.img is not the same file as the
> images/install.img on boot.iso. So don't do that.    

Sweet, nice job nailing the failure case folks!

Comment 33 Chris Lumens 2010-05-13 14:47:36 UTC

Right, so we've decided this is a big WONTFIX?  Please don't mix and match sources for boot.iso and NFS repo to avoid this.

Comment 34 Kamil Páral 2010-05-13 15:12:33 UTC

I don't object against the WONTFIX, but:

1. This decision should be visibly documented somewhere, because I (as the end-user) don't really know anything about implementation stuff like install.img inside boot.iso and inside repo.
2. There should be a check for this issue in the source code. You don't have to fix it or work around it, but you should at least display a dialog "Your boot.iso does not match the version of your NFS repository. Please... <do something>." and not just display traceback. Otherwise you will see this reported again and again.
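
To illustrate point 2, here is a minimal sketch of what such a check could look like. It is not a patch against anaconda: InstallImageMismatch, swap_install_image and the message text are all hypothetical, and the real isys.lochangefd is replaced by a stand-in that raises the same SystemError as in the traceback.

```python
import errno


class InstallImageMismatch(Exception):
    """Hypothetical error type carrying a user-facing explanation."""


def swap_install_image(lochangefd, loop_device, image_path):
    """Call lochangefd and translate EINVAL into a readable error.

    `lochangefd` is passed in so the sketch can run without anaconda;
    in the traceback above it would be isys.lochangefd.
    """
    try:
        lochangefd(loop_device, image_path)
    except SystemError as exc:
        if exc.args and exc.args[0] == errno.EINVAL:
            raise InstallImageMismatch(
                "The install image on your boot media does not match the one "
                "in the NFS repository. Use a boot.iso and a repository "
                "from the same tree."
            ) from exc
        raise


def fake_lochangefd(loop_device, image_path):
    # Stand-in that fails the same way as in the reported traceback.
    raise SystemError(22, "Invalid argument")


try:
    swap_install_image(fake_lochangefd, "/dev/loop0", "/mnt/sysimage/rhinstall-install.img")
except InstallImageMismatch as exc:
    print(exc)
```

The installer could then show this message in a dialog instead of the raw traceback.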

Comment 35 James Laska 2010-05-13 15:30:34 UTC

(In reply to comment #34)
> I don't object against the WONTFIX, but:
> 
> 1. This decision should be visibly documented somewhere, because I (as the
> end-user) don't really know anything about implementation stuff like
> install.img inside boot.iso and inside repo.

I just marked this bug for inclusion on the CommonBugs page.  Adam Williamson or I will walk that queue in a couple of days to document this issue.  Of course, anyone is welcome to document this directly in the meantime (see http://fedoraproject.org/wiki/Common_F13_bugs#My_bug_is_not_listed).

Comment 36 He Rui 2010-08-04 10:29:50 UTC

Reproduced it in F14-Alpha-TC2, since the install.img in the branched directory is different from the one in boot.iso.

Comment 37 Chris Lumens 2010-08-04 14:15:39 UTC

I believe comment #31 and #33 still apply, though.

Comment 38 James Laska 2010-08-04 14:39:33 UTC

Hurry, Chris is right. While confusing, this problem seems to manifest only during testing of Fedora before it is GA'd.  Comment #31 explains it well.  Basically, when testing snapshots, unless otherwise noted, we'll need to test using content provided only by that snapshot (this includes install.img, vmlinuz, initrd.img and repodata).

Comment 39 Kamil Páral 2010-08-05 07:49:50 UTC

I believe my comment #34 still applies as well. Unless the traceback is fixed and a meaningful error message is displayed, this bug will be reported over and over again. Maybe not by us, because we already know, but by other people. And the traceback is so cryptic that it will steal some brain cycles and time even from members of the QA team. It will be necessary to search for this bug and compare the traceback to know whether it should be reported or not.

Comment 40 James Laska 2010-08-05 12:20:20 UTC

clumens: will this problem magically go away with your proposed redesign of the stage2 install.img handling?

Comment 41 Chris Lumens 2010-08-05 13:24:58 UTC

It should.  I haven't gotten nearly this far into the work yet, but I don't think we'll even need to do the lochangefd anymore.
