176492 – Anaconda memory usage expands during "step reposetup"; Installer crash

Summary: Anaconda memory usage expands during "step reposetup"; Installer crash

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	anaconda
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Anaconda Maintenance Team
QA Contact:	Mike McLean
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	178385 180462
Depends On:
Blocks:	FC5Blocker

Reported:	2005-12-23 15:19 UTC by W. Michael Petullo
Modified:	2007-11-30 22:11 UTC
CC List:	4 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-02-16 21:21:11 UTC
Type:	---
Embargoed:
Dependent Products:

Description W. Michael Petullo 2005-12-23 15:19:49 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux ppc; en-US; rv:1.7.12) Gecko/20051215 Epiphany/1.9.3.1

Description of problem:
I am trying to do an HTTP installation on a VIA EPIA ME6000-based system with 256MB of memory.

Version-Release number of selected component (if applicable):
Raw Hide 12/22/05

How reproducible:
Always

Steps to Reproduce:
Choose HTTP install.  The installation proceeds to the "reposetup" step.  A progress bar is displayed and reaches 98%.  Then the installation process freezes, apparently after running out of memory.

Actual Results:  One TTY says:
... : moving (1) to step accounts
... : moving (1) to step reposetup
... : Warning, could not load sqlite, falling back to pickle

One TTY says:
Installation Progress
Retrieving installation information...
                98%

One TTY says:
<4>Swap cache: add 0, delete 0, find 0/0, race 0+0
<4>Free swap = 0kB
<4>Total swap = 0kB
<6>Free swap:           0kB
<6>57328 pages of RAM
<6>0 pages of HIGHMEM
<6>1412 reserved pages
<6>58 pages shared
<6>0 pages swap cached
<6>0 pages dirty
<6>0 pages writeback
<6>42335 pages mapped
<6>2100 pages slab
<6>74 pages pagetables
<3>Out of Memory: Killed process 456 (sh)

Additional info:

The installation process does not set up the system's swap space yet (see #114069 and #99520?).  However, even if I set up 256MB of swap space manually, after the disk partitioning step, the system still runs out of memory.  With the swap space enabled, it just takes much longer to consume all of the memory.
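
The page counts in the kernel's OOM report above can be converted to sizes as a sanity check. This is a rough sketch only: it assumes 4 KiB pages (typical for 32-bit x86, but not stated in the report), and the helper name is made up for illustration.

```python
# Assumption: 4 KiB pages, as is typical on 32-bit x86. The report does not say.
PAGE_SIZE_KIB = 4


def pages_to_mib(pages, page_size_kib=PAGE_SIZE_KIB):
    """Convert a kernel page count to MiB."""
    return pages * page_size_kib / 1024.0


# Counts copied from the OOM report in this description.
oom_report = [
    ("RAM", 57328),
    ("reserved", 1412),
    ("mapped", 42335),
    ("slab", 2100),
    ("pagetables", 74),
]

for name, pages in oom_report:
    print("%-10s %6d pages = %6.1f MiB" % (name, pages, pages_to_mib(pages)))
```

Under that assumption, the 57328 pages of RAM come to roughly 224 MiB, and the 42335 mapped pages to roughly 165 MiB, with no swap to fall back on.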

Comment 1 Jeremy Katz 2006-01-06 03:50:52 UTC

Is this better with newer rawhide?  The kernel was enabling some malloc tracking
for a while that massively increased the memory consumption of the installer...

Comment 2 W. Michael Petullo 2006-01-13 01:20:54 UTC

I just tried this again with a diskboot.img that I downloaded from Raw Hide on
12 Jan 06.  I had the same problem.

Comment 3 Will Woods 2006-01-16 23:22:23 UTC

Reproduced here, doing an NFS install on a machine with 128MB RAM - the machine
locks up and dies while solving dependencies.

In graphical mode the machine hard-locks. In text mode switching VTs works and
ctrl-alt-del works, but the shell is unresponsive and no progress appears to be
made.

When I tried setting up swap by hand just before the postselection stage, it
successfully gets to confirminstall. If I try to swapoff at that point, the
OOM-killer takes out Xorg and the install fails. If I *don't* swapoff here, the
mkswap/swapon fails in the filesystem step, and the installer exits.

Comment 4 Chris Lumens 2006-01-17 21:14:52 UTC

One idea is to move the postselection step to after the filesystems are enabled.
I've tested this and it still works, though I don't have a machine with the
memory pressure problem to begin with.  Will - I'll supply you with an updated
dispatch.py to test out in your installs.

The problem with this approach is that dependency resolution would then come
after the filesystems are created.  If there are dependency problems, you won't
be able to go back and do anything about it (not like we allow that now,
but...).  You'll also have just trashed whatever was on the disks without
knowing if you've even got an installable set of packages.  I suppose this isn't
so big of a problem because we do have the confirmation screen in there.

Another idea is to get yum's memory usage down.  I presume that's
where all the problems are coming from.  Jeremy, Paul - thoughts?

Comment 5 Will Woods 2006-01-18 14:05:36 UTC

I tested Chris' changes on my low-memory (128MB) machine, and got a fatal traceback:

OSError: [Errno 12] Cannot allocate memory
...
argv: ['lvm', 'vgchange', '-ay']

So anaconda could not exec LVM to create the swap partition, because it's already out of memory. 
Checking ps shows anaconda to be the culprit, using all available RAM in 2 processes (one using 128MB 
(VSZ) and the other using 34MB).

I'm starting to suspect a memory leak or something. I don't think anaconda normally needs 128MB+ of 
RAM to function, does it?

Comment 6 David Timms 2006-01-21 01:54:40 UTC

I see the same result installing fc5t2 on an (old) HP Omnibook 4150 notebook
(128M RAM), booting the boot.iso, for an ftp install and ks.  (The disk is also
quite constrained at 1.2G!)

  Installation Progress|Retrieving installation information...|98%
hang, but can still switch to screens

Alt-F3:
02:21:04 WARNING : step fixupconditionals does not exist
02:21:04 WARNING : step complete does not exist
repeat x 7
02:21:04 INFO    : moving (1) to step partitionobjinit
02:21:06 INFO    : moving (1) to step autopartitionexecute
02:21:07 INFO    : moving (1) to step partitiondone
02:21:07 INFO    : moving (1) to step bootloadersetup
02:21:07 INFO    : moving (1) to step networkdevicecheck
02:21:07 INFO    : moving (1) to step reposetup
02:21:07 WARNING : Warning, could not load sqlite, falling back to pickle
02:21:07 WARNING : /usr/lib/python2.4/site-packages/snack.py:250:
DeprecationWarning: integer argument expected, got float
  self.w = _snack.scale(width, total)
  
02:21:07 WARNING : /usr/lib/python2.4/site-packages/snack.py:247:
DeprecationWarning: integer argument expected, got float
  self.w.scaleSet(amount)
===
Alt-F4:
<4>Swap cache: add 0, delete 0, find 0/0, race 0+0
<4> Free swap  =0kB
<4> Total swap =0kB
<4> Free swap:              0kB
...
<6>51 pages pagetables
<3>Out of memory: Killed process 409 (sh).
===
Alt-F2 is not accepting keyboard entry.
---
I would expect that package resolution would need some memory... but I haven't
got to choose packages yet.

Would a #vmstat 2 monitor be useful?  It looks like it's not available on the
boot image (installer), and if provided by floppy it is missing libproc-3.2.5.
I notice also that the Alt-F2 console is now busybox; is this more memory hungry?

Comment 7 David Timms 2006-01-21 02:13:13 UTC

From another install attempt, without the default LVM config: a 20M /boot, 256M
swap and 980M /.
top shows this when the main screen stops at 98%:
Mem: 123932K used, 2352K free, 0K shrd, 1872 buff, 43028K cached
Load average: 0.55, 0.14, 0.04
cpu%  mem% command
75.5  56.9 anaconda
===
Does this mean that there is still memory available (2352K free), and that if
needed the buffers and cache should be discardable (i.e. another 44M available)?
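
The arithmetic behind that question can be sketched as follows. This is an upper bound under stated assumptions, not a statement of what the kernel can actually reclaim: the function name is hypothetical, the "1872 buff" figure is assumed to be in K like the others, and in an installer environment some cached pages may back RAM-resident filesystems that cannot be dropped without swap.

```python
def reclaimable_upper_bound_k(free_k, buffers_k, cached_k):
    """Upper bound on memory that could be made free, in K.

    Assumes every buffer and cache page can be discarded, which may not
    hold when parts of the page cache back a RAM-resident filesystem.
    """
    return free_k + buffers_k + cached_k


# Figures copied from the top output in this comment (units assumed to be K).
top_sample = {"used": 123932, "free": 2352, "buffers": 1872, "cached": 43028}

upper = reclaimable_upper_bound_k(
    top_sample["free"], top_sample["buffers"], top_sample["cached"]
)
total = top_sample["used"] + top_sample["free"]
print("total seen by top: %dK" % total)
print("free + buffers + cached: %dK (%.1f MiB)" % (upper, upper / 1024.0))
```

With the numbers above, free plus buffers plus cached comes to 47252K, so the "another 44M" estimate is in the right range if all of it were truly discardable.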

Comment 8 David Timms 2006-01-21 11:49:51 UTC

Another install attempt, this time with 256M RAM.  Exact same results: no go.
Tried 128+256: defaults to graphical boot and this gets to package selection OK,
and install completes, boots OK.
With the 128+256: linux text to install in text mode:
Even with enough RAM to run the gui installer, the text mode installer carks it!
As it starts the downloading information part, 
graph% cpu% mem%   
90     97.5 14.9
~93    96   78   
~96    96   81
~98    95   84  of 384M !
Perhaps there is an infinite loop, causing near 100% cpu, and then the OOM-killer
takes out (this time) anaconda.

Comment 9 David Timms 2006-01-22 05:36:14 UTC

For a giggle, the same text mode install on an hp nx9000 (512M RAM).  It gets
past the downloading information, but instead has an issue after selecting (all)
packages.  The screen is dependency check|checking dependencies in packages
selected for installation, and it hangs at 9%.  top shows %ram at 85.4%.
Switching between ttys is ok, but I couldn't quit out of top.

Comment 10 David Timms 2006-01-22 05:46:29 UTC

wtf does bugzilla dodgy up the line breaks, and make posts look ugly ;(
--
Same hp nx9000 (512MB), Choose no packages. Hang on same screen, but at 34%.
ps-
I noticed also, that the new grouping plan present in the gui is not implemented
(yet). Will it (eventually) be ?
Perhaps this should be a severity=high bug; you can only work around it I guess
if you have enough RAM, but text mode should be using less RAM than needed by
the GUI ?

Comment 11 Chris Lumens 2006-01-23 21:59:08 UTC

Is everyone who is hitting this bug seeing the following message:

02:21:07 WARNING : Warning, could not load sqlite, falling back to pickle

We have tested an install with mem=256M and haven't seen a problem, though we
did get shell lockups at 128M.

Comment 12 W. Michael Petullo 2006-01-28 23:35:12 UTC

Yes, I see the message:

07:34:56 INFO : Warning, could not load sqlite, falling back to pickle

I still see an out of memory error (Raw Hide 28 Jan 06.) with my 256MB system.

Comment 13 David Timms 2006-01-30 18:38:28 UTC

Yes, I also see the message on a HP notebook omnibook 4150 (384M), the nightly
build (2006-01-30) of boot.iso + repodata/base, CD boot, :linux text  the
messages are shown on Alt-F3:

time  WARNING : /usr/lib/python2.4/site-packages/snack.py:247:
DeprecationWarning: integer argument expected, got float
  self.w.scaleSet(amount)
time INFO     : moving (1) to step findinstall
...
time INFO     : moving (1) to step reposetup
time INFO     : Warning, could not load sqlite, falling back to pickle
---
Installation Progress|Retrieving installation information...
makes good progress up to 98%, during which time mem% usage in top -d 1 is going
up by approximately 0.3% per second (update). Once it hits 98%, over the last 4
secs the usage goes from 23% to 45% to 78% to 85.2%, where I think something
decides "it's all turned to hell": top no longer updates and you can't exit from
it, but you can still switch between terminals.
A short while later you see (a bit garbled, over the top of the progress dialog):
install exited abnormally
sending termination signals...done.
===
Also noticed that before getting to this stage Alt-F1 briefly shows the message:
/usr/sbin/load_policy Can't load policy: No such file or directory
before/while anaconda starts. This might be caused by not having a full install
tree ?
Is there some debug process that can provide further information ?

Comment 14 Paul Nasrat 2006-01-30 19:02:41 UTC

Thanks David.  The load_policy is unrelated to this. What method of install did
you choose (nfs, http, etc?).

Comment 15 David Timms 2006-02-02 13:17:31 UTC

> What method of install did you choose (nfs, http, etc?).
HP Omnibook 4150 (384M, 1.2G disk, Xircom RealPort CardBus 10/100 + modem pc card)
boot.iso (tried again today using 2006-02-01  6 453 248 B, same result)
linux text
ftp
192.168.2.115 (local subnet)
linux/fedora/core/5dev/disc
halts at 98%  (with 128 or 256 or 384M ram)

Comment 16 Paul Nasrat 2006-02-02 15:12:46 UTC


*** This bug has been marked as a duplicate of 179547 ***

Comment 17 Paul Nasrat 2006-02-02 18:19:36 UTC

Accidentally closed, sorry.

I've traced this down to sqlite not being available in minstg2.  It will be
present from tomorrow's rawhide; please try an http or ftp mode install on a
tree with a newer anaconda.
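
The "could not load sqlite, falling back to pickle" warning seen throughout this report is what an optional-import fallback looks like from the outside. The following is a minimal sketch of that pattern only, not yum's actual code: the function name and return values are hypothetical, and Python's standard `sqlite3` module stands in for the python-sqlite bindings that were missing from the image.

```python
import importlib


def pick_metadata_backend(module_name="sqlite3"):
    """Return "sqlite" if the named module imports, else "pickle".

    Hypothetical helper for illustration; yum's real selection logic
    is not shown in this bug report.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Same wording as the warning quoted in the installer log.
        print("Warning, could not load sqlite, falling back to pickle")
        return "pickle"
    return "sqlite"


if __name__ == "__main__":
    print("backend in use: %s" % pick_metadata_backend())
```

The point of the sketch is that the fallback is silent apart from one log line, so an image that merely lacks the module still runs, just on the slower and (per this report) far more memory-hungry path.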

Comment 18 David Timms 2006-02-04 07:48:58 UTC

re: this rawhide report:
anaconda-10.91.13-1
-------------------
* Thu Feb 02 2006 Jeremy Katz <katzj> - 10.91.13-1
...
- Add sqlite to traceonly to help http/ftp memory usage

I don't know how to determine whether this anaconda / sqlite made it into minstg2
(feb 3 07:24:00, 25,858,048B).
With a linux text install of the above, I still see the same OOM murder (sh) at 98%.

A gui install proceeds OK (384M), same laptop (omnibook 4150).

Comment 19 David Timms 2006-02-05 12:04:17 UTC

Repeated with: (feb 4 07:22  25,808,896), same laptop with linux text, ftp
install. 98% oom death.

# file minstg2.img
  squashfs (ah huh, got ya!).

# mount -o loop -t squashfs ... mount

# ls -l mount/usr/lib/libsq*
lrwxrwxrwx  1 root root     19 Feb  4 18:22 usr/lib/libsqlite3.so.0 ->
libsqlite3.so.0.8.6
-r-xr-xr-x  1 root root 369420 Feb  4 18:21 usr/lib/libsqlite3.so.0.8.6
so the lib is there...

When compared to stage2.img, the only other filenames mentioning sqlite are:
mount2/usr/lib/python2.4/site-packages/yum:
total 388
...
-r--r--r--  1 root root 15978 Feb  4 07:03 sqlitecache.py
lrwxrwxrwx  1 root root     9 Feb  4 18:23 sqlitecache.pyc -> /dev/null
-r--r--r--  1 root root 22195 Feb  4 07:03 sqlitesack.py
lrwxrwxrwx  1 root root     9 Feb  4 18:23 sqlitesack.pyc -> /dev/null
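
The manual comparison above (minstg2.img mounted on `mount`, stage2.img on `mount2`) could be scripted roughly as follows. This is a sketch under assumptions: both images are already loop-mounted at those two paths, and the helper names are made up for illustration.

```python
import os


def find_named_files(root, needle="sqlite"):
    """Return sorted paths (relative to root) whose file name contains needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if needle in name.lower():
                full_path = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full_path, root))
    return sorted(matches)


def missing_from(minimal_root, full_root, needle="sqlite"):
    """Matching files present under full_root but absent under minimal_root."""
    in_full = set(find_named_files(full_root, needle))
    in_minimal = set(find_named_files(minimal_root, needle))
    return sorted(in_full - in_minimal)


if __name__ == "__main__":
    # Mount points as used in this comment; adjust to wherever the images are mounted.
    for path in missing_from("mount", "mount2"):
        print(path)
```

Run against the two trees, this would list the sqlite-related files that stage2.img carries and minstg2.img lacks, which is the same question the `ls` output above answers by eye.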

Comment 20 David Timms 2006-02-06 14:06:16 UTC

The minstg2.img (2006-02-06 7:19) image is still missing sqlite.
I tried copying stage2 in as the minstg2 image, and was able to do the text mode
install from it, so I guess there are still some files needed?

Comment 21 Jeremy Katz 2006-02-06 15:22:29 UTC

The python-sqlite stuff wasn't in the file list -- I've added it now.

Comment 22 Paul Nasrat 2006-02-08 13:10:07 UTC

*** Bug 180462 has been marked as a duplicate of this bug. ***

Comment 23 Paul Nasrat 2006-02-10 21:35:43 UTC

*** Bug 178385 has been marked as a duplicate of this bug. ***

Comment 24 David Timms 2006-02-16 20:26:41 UTC

Repeated with: (boot.iso:feb 16 16:16, minstg2.img feb 16 16:17  26,861,568),
same laptop with linux text, ftp install. OK.

Instead of the pickle message I see: primary sqlite cache needs updating,
reading in metadata, and the installer now proceeds past the 98% mark and onto
package selection, excellent!  Consider it resolved - rawhide.
