Bug 499854 - filedescriptor out of range in select()
Summary: filedescriptor out of range in select()
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Joel Andres Granados
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 500607 501463 516041
Depends On:
Blocks: F12Alpha, F12AlphaBlocker
Reported: 2009-05-08 15:30 UTC by Clyde E. Kunkel
Modified: 2013-01-10 05:12 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-08-11 19:16:27 UTC
Type: ---
Embargoed:


Attachments
descriptor out of range (1.47 MB, text/plain)
2009-05-08 15:30 UTC, Clyde E. Kunkel
filedescriptor out of range in select (1.54 MB, text/plain)
2009-05-19 20:14 UTC, Clyde E. Kunkel
ValueError: filedescriptor out of range in select() (217.47 KB, text/plain)
2009-08-01 18:37 UTC, Clyde E. Kunkel
asnsaconda 12.7 tty4 screenshot (127.66 KB, image/jpeg)
2009-08-04 16:04 UTC, Clyde E. Kunkel
ls -l /proc/490/fd/ (56.08 KB, text/plain)
2009-08-04 18:36 UTC, Clyde E. Kunkel
ls -l /proc/514/fd/ (66 bytes, text/plain)
2009-08-04 18:37 UTC, Clyde E. Kunkel
ls -l /proc/464/fd/ (56.40 KB, text/plain)
2009-08-07 17:20 UTC, Clyde E. Kunkel
ls -l /proc/489/fd/ -- anac 12.10 (66 bytes, text/plain)
2009-08-07 17:21 UTC, Clyde E. Kunkel
anac 12.11 log from traceback (217.44 KB, text/plain)
2009-08-08 19:46 UTC, Clyde E. Kunkel
ls -l /proc/453/fd/ -- anac 12.11 (56.25 KB, text/plain)
2009-08-08 19:47 UTC, Clyde E. Kunkel
ls -l /proc/477/fd/ -- anac 12.11 (66 bytes, text/plain)
2009-08-08 19:47 UTC, Clyde E. Kunkel
updates image with iutil.py from latest git. (7.11 KB, application/octet-stream)
2009-08-10 08:46 UTC, Joel Andres Granados

Description Clyde E. Kunkel 2009-05-08 15:30:36 UTC
Created attachment 343121 [details]
descriptor out of range

Description of problem:
After selecting partitions to mount in custom partitioning (one for root, one for a current rawhide and one for /home, plus /boot on an ext3), anaconda began a filesystem check on the /home LV and then began checking ALL LVs (even those NOT mounted).  It appears this was too much, since this is a test system with 20-some LVs.  For that reason it seems appropriate for this to be low priority for the normal user and high only for me, since I won't be able to test anaconda on this system.


Version-Release number of selected component (if applicable):
51

How reproducible:
Every time

Steps to Reproduce:
1. boot.iso, askmethod, http://download.fedora....etc
2. custom partitioning
3. select partitions to mount, click next
  
Actual results:
traceback during fscking non-mounted LVs

Expected results:
truck on and normal install.

Additional info:
Please do not fsck LVs or any partition that is not mounted!!  I have 1 2 set in fstab for the LVs anyway, so they are checked during boot of my rawhide system, and I know there were no (or should not have been any) errors.

Comment 1 Clyde E. Kunkel 2009-05-08 18:46:01 UTC
Looks like only when you mount an LV during install.  Retried without mounting any LVs and got a normal install.

Comment 2 Chris Lumens 2009-05-13 13:16:23 UTC
*** Bug 500607 has been marked as a duplicate of this bug. ***

Comment 3 Clyde E. Kunkel 2009-05-14 22:42:16 UTC
Anaconda .52 is better.  Still checking LVs in VolGroup00 even tho NONE were mounted.  Did not check LVs in VGs 01, 02 (except root), and 03, nor any normal partitions.  Wonder why 00 got special treatment?

I understand checking preexisting partitions and LVs when mounting them in the installer, but do not understand why you would need to check those not mounted.

Comment 4 Chris Lumens 2009-05-19 13:55:09 UTC
*** Bug 501463 has been marked as a duplicate of this bug. ***

Comment 5 Chris Lumens 2009-05-19 15:05:41 UTC
So the original problem appears to be solved, but we're still going crazy with checking LVs that don't need to be?  That should be fixed in .53, we think.

Comment 6 Clyde E. Kunkel 2009-05-19 18:55:27 UTC
Install underway with .53 as I type and no LVs checked that didn't need to be.  Closing.

Comment 7 Clyde E. Kunkel 2009-05-19 20:13:50 UTC
Maybe I spoke too soon.  Near the end of the package installation, the same traceback error occurred.  Since this occurred at a very different point in the installation, should this be a new bz?  Anyway, attaching the dump and re-opening this one.

Comment 8 Clyde E. Kunkel 2009-05-19 20:14:56 UTC
Created attachment 344680 [details]
filedescriptor out of range in select

Comment 9 Clyde E. Kunkel 2009-05-19 23:13:06 UTC
Just tried again and same failure.  Noticed that at the bottom of the traceback there is a statement that all packages were installed.  I am not sure what the next step would have been, but there are no entries in /boot so maybe that is where things went awry.

This is definitely a show stopper bug.

Comment 10 Chris Lumens 2009-05-20 16:00:07 UTC
Is there any chance you could attach the output of ls -l /proc/<pid of anaconda>/fd/ to this bug report so we can see what file descriptors are still open at this point?  There are at least two anaconda processes, so you'll want to check them both.

Sounds like we're leaking file descriptors somewhere and running out of them for select to use.
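For readers unfamiliar with this error, here is a minimal sketch of where the message comes from. It assumes CPython on Linux, where select() can only watch descriptor numbers below FD_SETSIZE (commonly 1024); a process that leaks descriptors eventually gets handed numbers above that limit. The helper name fd_usable_with_select is hypothetical and is not part of anaconda.

```python
import select


def fd_usable_with_select(fd):
    """Return False if select() rejects `fd` as out of range."""
    try:
        # Zero timeout: we only care whether the call is accepted.
        select.select([fd], [], [], 0)
    except ValueError:
        # CPython raises "filedescriptor out of range in select()" when
        # the number is >= FD_SETSIZE (commonly 1024 on Linux).
        return False
    except OSError:
        # The range check passed; the descriptor is simply not open.
        return True
    return True


if __name__ == "__main__":
    for fd in (0, 1023, 1024, 100000):
        print(fd, fd_usable_with_select(fd))
```

The point of the sketch is that the failure depends on the descriptor's number, not on how many descriptors happen to be in use at that instant.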

Comment 11 Clyde E. Kunkel 2009-05-20 17:05:33 UTC
Testing now.  However, using a different mirror since download.fedora.etc is slow for me.

If the number of file descriptors includes the mounts I specify, I had specified 9 mounts including / and /boot.  In previous versions of anaconda, in the Fedora 10 days, I specified up to 16-17 mounts successfully.

Stand by for results.

Comment 12 Clyde E. Kunkel 2009-05-20 19:28:18 UTC
with anaconda .54, install completed successfully. :-/

FWIW I do have requested output of both anaconda pids/fd/ taken while install of packages was underway and the remaining pid when done.

Do you want them in view of no failure with .54 and failure with .53?

Note:  I use a very standard install test and have for several years.  I mount the same LVs and partitions in the same order. The only changes are i386 or x86_64 installs, mirrors, and a slightly different procedure when testing luks LVs.

Comment 13 Chris Lumens 2009-05-20 19:32:28 UTC
I think the data from the failed install with .53 would be better than from the successful install.  We've closed this bug a couple times now only to see it spring back up, so I'm not comfortable saying it's really fixed in .54.  That's especially the case because we didn't do anything in between the two versions that should affect this.

Comment 14 Clyde E. Kunkel 2009-05-20 23:08:37 UTC
I'll look for a mirror that still has .53.  Is there another way?

Comment 15 James Laska 2009-05-21 13:37:04 UTC
Clyde: You can find older rawhide images for download (from a slow link) at http://kojipkgs.fedoraproject.org/mash/

Comment 16 Clyde E. Kunkel 2009-05-21 19:20:44 UTC
No failure using the image from kojipkgs rawhide-20090519 .  The image used in comments 7 and 9 came from download.fedora.redhat.com and should have been identical.

Unfortunately I had blown away the original .53 anaconda image used when the errors occurred, so I can only assume there was a problem with that image, perhaps a corrupt routine that created the boot loader, since I noticed that vmlinuz and initrd were on /boot, but there was nothing in /boot/grub.

I have also tried an image from a mirror that hasn't synced for a couple of days and no errors with that image. I did get /proc/<pid>/fd/ output during the last successful .53 test.  Is it of any interest?

So, since I cannot reproduce after two original failures, this bz should be closed as insufficient information or can't reproduce.

Sorry for all of the noise.  Is there a way to check boot.iso images with a sha1sum or similar?
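One way to tell whether two downloaded images are byte-for-byte identical is to hash both and compare the digests. This is a minimal sketch only; the function name sha1_of_file and the chunk size are illustrative choices, and it says nothing about whether a reference checksum is published for a given image.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so a large ISO never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two copies of boot.iso fetched from different mirrors are the same build only if the digests match.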

Comment 17 Chris Lumens 2009-05-21 21:04:07 UTC
There's a dot file in the top level of the boot.iso that has a timestamp or build ID in it.  You can use that to compare if you would like.  I guess the /proc output is not all that interesting from a working install.

I'll close this one again, but I'm worried you'll be able to reproduce it again and we will have to open it right back up.

Comment 18 Clyde E. Kunkel 2009-08-01 18:37:24 UTC
Created attachment 355888 [details]
ValueError: filedescriptor out of range in select()

Getting this now with anaconda 12.7.  It occurs at the end of a long storage probe and there is no anacdump.txt file, thus the .log file is attached.

(FWIW, on this system, have not had a successful install since storage system rewrite.  FC 9 and 10 and 11 alpha installed.)

Comment 19 Clyde E. Kunkel 2009-08-03 03:26:08 UTC
I think this needs to be opened up again since it is now stopping me from installing rawhide on a test system.

My motivation is this:

  1)  All fedora incarnations thru fedora 11 alpha installed, but none after that using the custom configuration alternative.
  2)  All software needs to be robust and if some limits are exceeded the software needs to at a minimum gracefully report the fact and allow alternatives to go forward even if those alternatives don't suit the user.

At a minimum, anaconda should allow the usual installation choices and allow a custom configuration choice where whatever problem is present can be avoided with user decisions and actions.

I got fedora 11 to install by installing fedora 10 and then choosing the upgrade alternative with the install DVD.  Custom partitioning would not work.

Thanks for your consideration.

Comment 20 Andy Lindeberg 2009-08-04 14:37:22 UTC
Now that you've reproduced this, can you attach the output of ls -l /proc/<pid of anaconda>/fd/ to this report? As before, you'll want to check both anaconda processes.

Comment 21 Clyde E. Kunkel 2009-08-04 16:04:45 UTC
Created attachment 356204 [details]
asnsaconda 12.7 tty4 screenshot

I used top to get the pids previously, but now top seems to be missing on the Fedora 12 Install Test DVD.  Is there another way to get the pids?

Anyway, attaching a screenshot from tty4 showing msgs that the two mds have an unknown partition table when in fact they are partitioned with ext4. md0 is a raid 10 device and md1 is a raid 5 device.  Neither are being used for anything--I prepped them to use as pvs for lv over raid / filesystems for test purposes.

Maybe today's rawhide image will provide top...I'll work on that.

Comment 22 Radek Vykydal 2009-08-04 16:27:31 UTC
(In reply to comment #21)

> 
> I used top to get the pids previously, but now top seems to be missing on the
> Fedora 12 Install Test DVD.  Is there another way to get the pids?
> 

You can use ps:

ps -C anaconda -o pid
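Once the pid is known, the same information that ls -l /proc/<pid>/fd/ prints can be gathered programmatically. The following is a rough sketch that assumes a Linux /proc filesystem; list_open_fds is a hypothetical helper, not an anaconda function.

```python
import os


def list_open_fds(pid):
    """Map each open descriptor number of `pid` to its target (Linux /proc only)."""
    fd_dir = "/proc/%d/fd" % pid
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


if __name__ == "__main__":
    for number, target in sorted(list_open_fds(os.getpid()).items()):
        print(number, "->", target)
```

In a listing like this, many entries pointing at the same kind of target (pipes, for example) and descriptor numbers climbing toward 1024 would be the signs of a leak worth looking for.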

Comment 23 Clyde E. Kunkel 2009-08-04 18:36:15 UTC
Created attachment 356226 [details]
ls -l /proc/490/fd/

Thanks for the pid method.

Process 490 attached.

Comment 24 Clyde E. Kunkel 2009-08-04 18:37:34 UTC
Created attachment 356227 [details]
ls -l /proc/514/fd/

Comment 25 Andy Lindeberg 2009-08-06 14:51:09 UTC
*** Bug 516041 has been marked as a duplicate of this bug. ***

Comment 26 Joel Andres Granados 2009-08-06 16:03:45 UTC
Should be solved in the next anaconda (12.9)

Comment 27 Clyde E. Kunkel 2009-08-07 14:52:30 UTC
Not solved in JKeating's 12.10.

Comment 28 Joel Andres Granados 2009-08-07 15:24:48 UTC
Can you post the anaconda dump/logs and the list of fds located in /proc/{ANACONDA_PID}/fd?

Comment 29 Joel Andres Granados 2009-08-07 15:26:39 UTC
FYI, I am also testing with JKeating's 12.10 at http://jkeating.fedorapeople.org/boot.iso

Comment 30 Joel Andres Granados 2009-08-07 16:01:38 UTC
Are you sure you are not hitting 516168?  You must have latest git for this to go past this bug.

Comment 31 Clyde E. Kunkel 2009-08-07 17:13:49 UTC
If JKeating boot.iso contains latest git, then I did not hit 516168.  I will rerun and get log....didn't before since error looked identical.

Question:  I ran the 12.10 boot.iso with askmethod and the mirror was VT as of 20090806.  Is this a prob?

Comment 32 Clyde E. Kunkel 2009-08-07 17:20:43 UTC
Created attachment 356687 [details]
ls -l /proc/464/fd/

I forgot:  sent the traceback automatically, see:
https://bugzilla.redhat.com/attachment.cgi?id=356663

And attaching the open fds which I, miraculously, somehow remembered to do!

Comment 33 Clyde E. Kunkel 2009-08-07 17:21:45 UTC
Created attachment 356688 [details]
ls -l /proc/489/fd/ -- anac 12.10

Comment 34 Clyde E. Kunkel 2009-08-07 17:30:56 UTC
Went back and looked at 516168...if this is also the x server failing, then I did hit it, but was given the option to continue with VNC or text install.  Chose text install this time and hit this bug at what looked like end of finding storage devices.

(FWIW, vnc works nicely also when X fails.)

Comment 35 Joel Andres Granados 2009-08-08 15:21:01 UTC
(In reply to comment #32)
> Created an attachment (id=356687) [details]
> ls -l /proc/464/fd/
> 
> I forgot:  sent the traceback automatically, see:
> https://bugzilla.redhat.com/attachment.cgi?id=356663

The issue with this traceback is that it shows that the code being executed does not have the fd patch.  Look at the line number where the traceback occurs in the iutil.py file:

"""
  File "/usr/lib/anaconda/iutil.py", line 163, in execWithCapture
    (outStr, errStr) = proc.communicate()
"""

The thing is that in git that line is not at 163; it's at 182, where it ended up after the fd patch.  I'm not sure how this is happening.  Pls test with current anaconda 12.11.
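The contents of the fd patch are not shown in this report, so the following is NOT that patch. It is only a general sketch of how a capture helper can avoid leaving pipe descriptors behind, plus a way to check for leaks by counting descriptors before and after. The names exec_with_capture and count_open_fds are hypothetical, and the count relies on a Linux /proc filesystem.

```python
import os
import subprocess


def count_open_fds():
    """Number of descriptors open in this process (Linux /proc only)."""
    return len(os.listdir("/proc/self/fd"))


def exec_with_capture(argv):
    """Run `argv` and return its stdout as text, leaving no pipes open."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, close_fds=True)
    try:
        out, _err = proc.communicate()
    finally:
        # communicate() normally closes the pipes; closing again is a
        # harmless safety net if it raised part-way through.
        for pipe in (proc.stdout, proc.stderr):
            if pipe is not None:
                pipe.close()
    return out.decode()
```

Calling count_open_fds() before and after a batch of exec_with_capture() calls should give the same number; a count that grows with every call points to a leak of the kind suspected in comment 10.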

Thx for testing :)

Comment 36 Clyde E. Kunkel 2009-08-08 16:48:41 UTC
OK, glad to test.  Where can I get a boot.iso with 12.11?

Comment 37 Ian Pilcher 2009-08-08 17:28:43 UTC
(In reply to comment #36)
> OK, glad to test.  Where can I get a boot.iso with 12.11?  

Same question here.  This bug prevented me from installing Fedora 11, and it's
now preventing me from installing Fedora 12 Alpha.  Would eagerly test an
updated anaconda.

Comment 38 Ian Pilcher 2009-08-08 17:56:16 UTC
(In reply to comment #37)
> Same question here.  This bug prevented me from installing Fedora 11, and it's
> now preventing me from installing Fedora 12 Alpha.  Would eagerly test an
> updated anaconda.  

http://jkeating.fedorapeople.org/boot.iso appears to have anaconda 12.11.

Comment 39 Clyde E. Kunkel 2009-08-08 19:46:08 UTC
Created attachment 356785 [details]
anac 12.11 log from traceback

Failed same way in 12.11.  Appears to be at or near end of locating storage devices or ready for next screen.

Comment 40 Clyde E. Kunkel 2009-08-08 19:47:05 UTC
Created attachment 356786 [details]
ls -l /proc/453/fd/ -- anac 12.11

Comment 41 Clyde E. Kunkel 2009-08-08 19:47:42 UTC
Created attachment 356787 [details]
ls -l /proc/477/fd/ -- anac 12.11

Comment 42 Ian Pilcher 2009-08-08 20:11:11 UTC
In my case, I never got a traceback from anaconda, but it just hung after a
long time in the "Finding storage devices" stage.  From the limited amount I
could glean, it seems that mdadm --assemble may have hung.

Comment 43 Joel Andres Granados 2009-08-10 08:46:11 UTC
Created attachment 356861 [details]
updates image with iutil.py from latest git.

(In reply to comment #39)
> Created an attachment (id=356785) [details]
> anac 12.11 log from traceback
> 
> Failed same way in 12.11.  Appears to be at or near end of locating storage
> devices or ready for next screen.  

This traceback still has the same issue as before.  I'm investigating the origin of this problem.  For the time being, can you test with the attached image?  If you still get a traceback, can you post it?  (Only the traceback is necessary.)

Comment 44 Joel Andres Granados 2009-08-10 09:30:33 UTC
I think your problem is this:

<attachment (id=356785)>
.
.
.
'--stage2', 'http://kojipkgs.fedoraproject.org/mash/rawhide/x86_64/os/images/install.img'
.
.
.
</attachment (id=356785)>

This means that you are using stage1 from boot.img but stage2 from that URL, which has anaconda-12.7-1.fc12.x86_64.rpm, which does not have the file descriptor fix.  Pls use the stage2 image from the iso image (that is what I did for my tests).  You should be ok by just choosing one of the default targets from the grub menu.

The reason that the version of anaconda in the traceback is not 12.7 is that it is stage1 that prints the version, not stage2 (something to think about now that we are handling stage2 images...).
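As an illustration of the kind of check hinted at here, the sketch below compares two dotted version strings numerically. It is purely hypothetical: parse_version and stages_match are made-up names, and nothing in this report says anaconda performs such a check.

```python
def parse_version(text):
    """Turn a dotted version such as '12.11' into a tuple of ints: (12, 11)."""
    return tuple(int(part) for part in text.split("."))


def stages_match(stage1_version, stage2_version):
    """True when both stages report the same version."""
    return parse_version(stage1_version) == parse_version(stage2_version)


if __name__ == "__main__":
    # A stage1 from the 12.11 boot.iso combined with a 12.7 stage2.
    print(stages_match("12.11", "12.7"))
```

Comparing tuples of integers rather than raw strings matters for these version numbers, because as plain strings "12.11" sorts before "12.7".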

Again, thx for testing  :)

Comment 45 Clyde E. Kunkel 2009-08-10 13:18:26 UTC
Fixed except now 516557.

This has been a real learning experience.

Where, if you can't use askmethod, would I have been prompted for a source of packages?  Or, is there a different parameter to give vice askmethod?  Or, was this a special boot.iso just for testing purposes?

Thanks for your patience and bring it on, I will test anything you throw my way.

Comment 46 Joel Andres Granados 2009-08-11 08:28:08 UTC
I'm changing to modified based on comment 45.

Clyde:
Try to pass the source of packages with the method=URL argument, not askmethod.

Comment 47 Jesse Keating 2009-08-11 18:22:51 UTC
I'm a bit confused.  Where was the modification that fixed this issue, and is there still a newer build of anaconda needed to fix it?  Clyde, could you try with 20090811 rawhide, without any updates.img to see if it works?  If it does, I think we can go from MODIFIED to closed here.

Comment 48 Clyde E. Kunkel 2009-08-11 19:16:27 UTC
Yes...close (I'll do it).  I confirmed in comment 45 where I skipped using askmethod which was changing the environment by mixing two versions of anaconda.

Tested with rawhide 20090811 this am and this bug is fixed.

(Several others tho....so we will continue to have fun :-)  )

