Bug 176516 - rawhide anaconda crash
Summary: rawhide anaconda crash
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: David Cantrell
QA Contact: Mike McLean
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-12-23 23:36 UTC by Andy Burns
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-01-31 20:48:58 UTC
Type: ---
Embargoed:



Description Andy Burns 2005-12-23 23:36:42 UTC
Description of problem:

Because of a kernel issue I currently cannot install rawhide using vga/tty1 as
the console (that is not the issue I am reporting here), but I can boot the
install CD with a serial console. Anaconda proceeds OK in text mode on the
serial console until just after the notification of where install.log will be
written, then crashes with a backtrace.

Version-Release number of selected component (if applicable):

rawhide 2005-12-23

How reproducible:

100%

Steps to Reproduce:
1. boot from boot.iso cd or pxelinux
2. enter "linux console=ttyS0,115200 text"
3. take all defaults other than dhcp/http settings
  
Actual results:

[I hope bugzilla keeps the fixed font formatting]

Welcome to Fedora Core
               +-------------+ Exception Occurred +-------------+
               |                                                |
               | Traceback (most recent call last):             |
               |   File "/usr/bin/anaconda", line 1188, in ?  : |
               |     intf.run(id, dispatch)                   : |
               |   File "/usr/lib/anaconda/text.py", line     : |
               | 559, in run                                  : |
               |     dispatch.gotoNext()                      : |
               |   File "/usr/lib/anaconda/dispatch.py",      : |
               | line 144, in gotoNext                        : |
               |     self.moveStep()                          : |
               |   File "/usr/lib/anaconda/dispatch.py",      : |
               | line 215, in moveStep                        : |
               |     rc = apply(func, self.bindArgs(args))    : |
               |   File "/usr/lib/anaconda/packages.py",      : |
               | line 250, in turnOnFilesystems               : |
               |     partitions.doMetaDeletes(diskset)        : |
               |   File "/usr/lib/anaconda/partitions.py",    : |
               | line 1254, in doMetaDeletes                  : |
               |     lvm.vgactivate()                         : |
               |   File "/usr/lib/anaconda/lvm.py", line      : |
               | 88, in vgactivate                            : |
               |     searchPath = 1)                          : |
               |   File "/usr/lib/anaconda/iutil.py", line    : |
               | 36, in execWithRedirect                        |
               |     childpid = os.fork()                     : |
               | OSError: [Errno 12] Cannot allocate memory   : |
               |                                                |
               |    +----+       +--------+       +-------+     |
               |    | OK |       | Remote |       | Debug |     |
               |    +----+       +--------+       +-------+     |
               |                                                |
               |                                                |
               +------------------------------------------------+

  <Tab>/<Alt-Tab> between elements   |  <Space> selects   |  <F12> next screen

Expected results:

The actual installation proceeds at the point where the crash currently occurs.

Additional info:

I don't have a floppy drive on this machine, so I can't save the trace there.
If I try to save it to a remote machine, a further crash occurs:

Traceback (most recent call last):
  File "/usr/bin/anaconda", line 1192, in ?
    handleException(dispatch, intf, sys.exc_info())
  File "/usr/lib/anaconda/exception.py", line 414, in handleException
    scpRc = copyExceptionToRemote(intf)
  File "/usr/lib/anaconda/exception.py", line 225, in copyExceptionToRemote
    import pty
ImportError: No module named pty
install exited abnormally
                                                                             
sending termination signals...done
sending kill signals...done
disabling swap...
unmounting filesystems...
  mnt/runtime done
  disabling /dev/loop0
  /proc/bus/usb done
  /proc done
  /dev/pts done
  /sys done
  /tmp/ramfs done
  /selinux done
you may safely reboot your system

I like the easter egg ...

"anaconda installer init version 10.90.23 using a serial console
remember, cereal is an important part of a nutritionally balanced breakfast."
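
For readers following the first traceback: errno 12 is ENOMEM, which means the kernel refused to create the child process that os.fork() asked for. The sketch below shows, in current Python, how such a failure can be told apart from other fork errors. It is not anaconda's iutil.execWithRedirect, whose source is not shown in this report; describe_fork_failure and run_child are hypothetical names used only for illustration.

```python
import errno
import os


def describe_fork_failure(exc):
    """Return a short diagnosis for an OSError raised by os.fork()."""
    if exc.errno == errno.ENOMEM:
        return "fork failed: kernel could not allocate memory for the child (ENOMEM)"
    if exc.errno == errno.EAGAIN:
        return "fork failed: process or resource limit reached (EAGAIN)"
    return "fork failed: %s" % os.strerror(exc.errno)


def run_child(argv):
    """Fork and exec argv; return the raw wait status, or None if fork fails."""
    try:
        pid = os.fork()
    except OSError as exc:
        # This is the branch the traceback in this report ended in.
        print(describe_fork_failure(exc))
        return None
    if pid == 0:
        # Child: replace the process image; exit with 127 if exec itself fails.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return status


# Simulate the error from the report without actually exhausting memory.
print(describe_fork_failure(OSError(errno.ENOMEM, "Cannot allocate memory")))
```

The Python 2.4 interpreter shipped in the installer would need the older "except OSError, exc" spelling; the errno check itself is the same.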

Comment 1 Chris Lumens 2006-01-04 20:13:16 UTC
How much memory do you have?  If you switch over to tty2 before the exception,
does /usr/lib/python2.4/pty.py exist?  You can also try running:

python -c "import pty"

and seeing if that throws any errors.
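
A slightly more informative variant of the same check, as a sketch for current Python only: it reports where a module would be loaded from, without importing it. importlib.util does not exist in Python 2.4, so inside the 2005 installer environment the one-liner above remains the practical test. locate_module and the name no_such_module_for_demo are hypothetical.

```python
import importlib.util


def locate_module(name):
    """Return the origin a top-level module would load from, or None if missing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


for name in ("pty", "no_such_module_for_demo"):
    path = locate_module(name)
    print("%s: %s" % (name, path if path else "not importable"))
```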

Comment 2 Andy Burns 2006-01-04 22:41:37 UTC
I have 1GB

I will try a fresh install using today's rawhide (using the serial console even
though the kernel issue is fixed and I can use the vga/tty again)



Comment 3 Andy Burns 2006-01-05 00:01:15 UTC
Tested again using today's rawhide ...

First I attempted to install the i386 version. I paused the installation at the
stage before the crash and went back to the vga console (remember this is a
serial console install). I could switch VTs OK, but there is nothing on VT1
(expected) and bash is not running on VT2 (a consequence of console=ttyS0?), so
I can't check for the pty.py file.

I can see anaconda's output on VT3 and VT4 and have posted photos of them, but
nothing looks "interesting" to me

http://adslpipe.co.uk/32bitvt3.jpg
http://adslpipe.co.uk/32bitvt4.jpg

If I go back to the serial console and proceed with the installation, it still
crashes and gives the same backtrace as originally reported.

I then ran a couple of tests using the x86_64 version, with results that
differed from i386 and were not quite the same each time ...

1st run, anaconda hung at the dependency check. I initially thought it was
caused by the "unresolvable dependency on libgtop-2.0.so", but then noticed
that the i386 install had shown that same message and carried on.

2nd run, the oom killer stepped in and killed anaconda! 

http://adslpipe.co.uk/64bitvt4.jpg

So I think the initial "cannot allocate memory" error is correct and something
is leaking badly, but only during a serial console install. On 32bit installs
Python catches this cleanly; on 64bit installs it only surfaces at the kernel
level, as a hang or an OOM kill.
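
One way to test the leak theory, where a shell is available during the install, would be to sample free memory at intervals and watch it fall. The sketch below parses /proc/meminfo-style text in current Python. parse_meminfo and free_kb are hypothetical helper names, and the SAMPLE figures are invented for illustration; they are not measurements from this machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def free_kb(text):
    """Return MemFree plus SwapFree, treating a missing field as zero."""
    info = parse_meminfo(text)
    return info.get("MemFree", 0) + info.get("SwapFree", 0)


# Illustrative sample only; these numbers are made up, not from the report.
SAMPLE = """MemTotal:      1035000 kB
MemFree:         12000 kB
SwapTotal:           0 kB
SwapFree:            0 kB
"""

print(free_kb(SAMPLE))
```

On a live Linux system the text would come from reading /proc/meminfo; a steadily shrinking figure during the dependency check would support the leak explanation.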



Comment 4 Jeremy Katz 2006-01-06 03:52:15 UTC
So it's fine if you're not doing a serial install?  I can't think of how they'd
be different here.  And I haven't hit problems on my test box with a gig of RAM.

Comment 5 Andy Burns 2006-01-06 09:21:40 UTC
> So it's fine if you're not doing a serial install? 

Absolutely fine now on vga.

> I haven't hit problems on my test box

Have you tried a serial install on both i386 and x86_64?
The traceback is 100% reproducible for me on i386, and x86_64 always hits
"some" problem: mostly it just hangs; the OOM kill has only happened once.

Shall I continue testing? Anything I can try to reveal more about the bug?

I don't particularly need a serial console on this desktop; I just use it for
capturing panic information. But we have servers in a remote datacenter where
we use serial BIOS redirection + console for "last chance" remote access. I
would hate to find a problem on one of those machines, want to re-install
remotely, and have to make a 200 mile round trip because of this bug :-(



Comment 6 David Cantrell 2006-01-18 20:06:38 UTC
The serial install problem reported isn't happening for me in FC5 Test 2.  Can
you test with that version?  I've just done several serial console installs
(9600bps, 8-N-1) and it's working fine.

Comment 7 Andy Burns 2006-01-19 21:13:22 UTC
I'm not quite in a position to reformat the test box at the moment. I tried a
re-install without formatting and it picked up all my partitions OK, but
apparently text mode won't allow editing LVM to specify which partition is /,
so I can't proceed for a day or two, but I will ...

Feel free to dump this straight back to NEEDINFO_REPORTER so I don't forget,
just didn't want you to think it was a bug-and-run ...



Comment 8 David Cantrell 2006-01-19 21:15:07 UTC
Doing so now.  Here's your reminder.

Thanks.

Comment 9 Andy Burns 2006-01-21 20:01:17 UTC
Installing using rawhide 2005-01-20, which is what this machine was running up
until being wiped.

I backed up the parts of the machine I wanted to keep, then did a serial
install set to wipe all partitions and install the default packages, but it
crashed at 30% of the dependency check. I repeated this twice to see if it was
reproducible, and it was. I tried again with "debug" on the linux command line,
which didn't add any further information, then tried text mode on the vga
console instead of serial: same problem.

When in vgacon mode I switched VTs to see if any of them revealed further
information. On VT4 there was our old friend the OOM killer, having killed
anaconda; see http://adslpipe.co.uk/oomanaconda.jpg

In graphical mode it installs without a hitch, but I conclude that in text mode
rawhide just isn't installable on this system. This issue has persisted for
almost a month; one way or another it always runs out of memory or crashes due
to lack of memory (or possibly heap/slab corruption?). Is it worth running it
past davej for his input, as a kernel rather than an anaconda issue?

      +----------------------+ Dependency Check +-----------------------+
      |                                                                 |
      | Checking dependencies in packages selected for installation...  |
      |                                                                 |
      |                               30%                               |
      |                                                                 |
      +-----------------------------------------------------------------+

  <Tab>/<Alt-Tab> between elements   |  <Space> selects   |  <F12> next screen

install exited abnormally
sending termination signals...done
sending kill signals...done
disabling swap...
unmounting filesystems...
  /mnt/runtime done
  disabling /dev/loop0
  /proc/bus/usb done
  /proc done
  /dev/pts done
  /sys done
  /tmp/ramfs done
  /selinux done
you may safely reboot your system



Comment 10 David Cantrell 2006-01-23 21:41:41 UTC
We're getting the OOM killer in a lot of other yum-related bug reports.  We're
looking into the issue.  The dependency/memory problem is probably with the yum
backend (something is leaking).
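
As a rough illustration of how memory growth in a single step can be narrowed down, the sketch below records the process's peak resident set size before and after running a function, in current Python on Linux. It says nothing about how the yum backend was actually debugged; measure, peak_rss_kb and hypothetical_step are invented names, and the workload is a stand-in.

```python
import resource


def peak_rss_kb():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure(step):
    """Run step() and return (result, growth in peak RSS while it ran)."""
    before = peak_rss_kb()
    result = step()
    return result, peak_rss_kb() - before


def hypothetical_step():
    # Stand-in workload: hold about a million small objects at once.
    return len([str(i) for i in range(1000000)])


result, growth = measure(hypothetical_step)
print("step returned %d, peak RSS grew by %d kB" % (result, growth))
```

Peak RSS never decreases, so this only shows which step pushed the high-water mark up; it cannot show memory being released afterwards.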

Comment 11 Jeremy Katz 2006-01-31 20:48:58 UTC
This should be somewhat better now.

Comment 12 Andy Burns 2006-01-31 21:07:34 UTC
OK, I just went to do a PXE boot to try this. Halfway through I had a "doh!"
moment as I realised this won't be in rawhide yet, but by that time the machine
was just starting anaconda. Unfortunately it failed because it could not find
libexpat.so.0 in __init__.py. Will this be a show stopper for me testing this
tomorrow, or has it been fixed already?

on x86_64 btw.

