Bug 3609

Summary:	Numerous problems with initscripts 4.21
Product:	[Retired] Red Hat Raw Hide
Component:	initscripts
Version:	1.0
Hardware:	i386
OS:	Linux
Status:	CLOSED NEXTRELEASE
Severity:	high
Priority:	medium
Reporter:	Brian Ryner <bryner>
Assignee:	Bill Nottingham <notting>
CC:	rvokal, saurik
Doc Type:	Bug Fix
Last Closed:	1999-06-23 19:24:29 UTC

Description Brian Ryner 1999-06-21 04:39:46 UTC

After upgrading to initscripts 4.21 I experienced the
following problems:

- The network scripts never terminated when I connected by
ppp (ip-up, ifup-post, etc).

- On booting, I received the message:
/var/run/runlevel.dir: read-only filesystem
presumably because it was trying to write to this file on
the root partition before it had been remounted read/write.

- The boot hung entirely at "bringing up interface lo"

Comment 1 Bill Nottingham 1999-06-21 17:01:59 UTC

- What were the network scripts doing when they were hanging?
(of course, if ip-up hangs, ifup-post will still be around, because
that's what calls it...)
- This shouldn't happen. :)  What it sounds like is happening
is that rc.sysinit is exiting prematurely; if you watch it booting,
does it seem like rc.sysinit is completing successfully?
- What's your networking config look like - are you running the
stock kernel, or a custom one?

Comment 2 Brian Ryner 1999-06-21 17:15:59 UTC

I'm not sure exactly what the network scripts were doing when they
hung.  I connected using kppp, and doing a "ps ax" showed the scripts
were still running.  Come to think of it... are those scripts supposed
to run at all using kppp?  I believe it invokes pppd directly... but
at any rate, this didn't happen with previous initscripts releases.

I don't still have initscripts 4.21 installed, so I can't really say
what would have caused these things to happen.  I am running
kernel-2.2.5-23 from rawhide.  I don't have any network cards
installed.

Comment 3 Jay Freeman 1999-06-21 18:36:59 UTC

I am experiencing the same bootup problem (just installed RawHide on
a computer at work, complete install except for xanim and aktion).
Been trying to isolate what is causing it, and it _does_ seem that
rc.sysinit doesn't want to complete.  With the original scripts, the
computer would get to the swapon, not complete that, and immediately
go into trying to bring lo online, which of course failed, and caused
the computer to just lock up.  Am working on commenting out various
lines in rc.sysinit to get it to work (temporarily commented the
various run level lines in /etc/inittab as well to clean up the
screen during bootup).

Comment 4 Bill Nottingham 1999-06-21 18:38:59 UTC

Duh. Actually, pppd invokes /etc/ppp/ip-up, which invokes
/etc/ppp/ip-up.local & ifup-post. So, yes, they would run
with kppp. They are still running as opposed to just being
zombies, correct?
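
One way to answer the zombie question is to look at the STAT column
of ps, where a leading Z marks a defunct process.  The helper below is
only a sketch: classify_stat is a hypothetical name, not part of
initscripts, and it assumes procps-style state letters.

```shell
# classify_stat is a hypothetical helper, not part of initscripts.
# It maps a ps STAT value (procps-style state letters assumed) to a label.
classify_stat() {
    case "$1" in
        Z*) echo "zombie" ;;    # exited, waiting for the parent to reap it
        T*) echo "stopped" ;;   # suspended by a signal or a debugger
        *)  echo "alive" ;;     # R, S, D and friends: still a live process
    esac
}

classify_stat "Z"
classify_stat "S"
```

On a live system, something like "ps axo pid,stat,args" would supply
the STAT values for the ip-up and ifup-post processes mentioned above.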

As for the others, I hate to say it, but unless I can diagnose
what the problem is, I can't fix it. :(

I'll try and futz with them some around here to see if I can
reproduce it, but they seemed to run OK on the one test machine
I booted them up on...

Comment 5 Jay Freeman 1999-06-21 19:13:59 UTC

For some reason, init is trying to go into run level 5 (I chose to
load directly to X during setup) BEFORE sysinit is done running.
What I am getting is:

Loading default keymap
INIT: Entering runlevel: 5
INIT: cannot execute "/etc/X11/prefdm"
<another couple times>
Checking root filesystem
<more prefdm>
/dev/hda1 was not cleanly unmounted, check forced.
past fsck...
<more prefdm>

The "past fsck..." is an echo I placed one line below where fsck is
being run.  Strangely enough, fsck doesn't seem to be closing correctly
(since there wasn't any fragmentation report), but the line after it
in rc.sysinit continued to execute (and past that, no more evidence
of rc.sysinit operating, will add more echos to see where it is dying
after that).  Another thing of note, I don't remember seeing this
before (but then again I don't boot often and probably wouldn't
remember anyway):

change_root: old root has d_count=1
Trying to unmount old root ... okay

Comment 6 Bill Nottingham 1999-06-21 19:52:59 UTC

Hmm... what will probably solve it (in the meantime)
is if you comment out the
'Rerun ourselves through initlog' section at the beginning
of rc.sysinit.

If you're really adventurous, you can change that line
to:
[ -f /sbin/initlog ] && exec /usr/bin/gdb /sbin/initlog...

and see what's going on in gdb. ;)
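
For readers unfamiliar with the "rerun ourselves" idiom, here is a
minimal sketch of a script that re-executes itself once through a
wrapper.  It is not the actual rc.sysinit code: the guard variable
DEMO_WRAPPED and the use of /usr/bin/env as a stand-in wrapper are
assumptions made for illustration.  Swapping the wrapper for a
debugger, as suggested above, would change only the exec line.

```shell
# Sketch only: DEMO_WRAPPED and the env wrapper are hypothetical stand-ins,
# not taken from rc.sysinit or initlog.
unset DEMO_WRAPPED
demo_dir=$(mktemp -d)

cat > "$demo_dir/rerun_demo.sh" <<'EOF'
#!/bin/sh
# On the first pass the guard is unset, so hand control to the wrapper.
# exec replaces this process; the rest of the script only runs once the
# wrapper has started it again with the guard set.
if [ -z "$DEMO_WRAPPED" ]; then
    echo "first pass: re-executing through wrapper"
    [ -x /usr/bin/env ] && exec /usr/bin/env DEMO_WRAPPED=1 sh "$0" "$@"
fi
echo "real work runs here"
EOF

rerun_out=$(sh "$demo_dir/rerun_demo.sh")
echo "$rerun_out"
rm -rf "$demo_dir"
```

The guard variable is what keeps the script from re-executing itself
forever; commenting out the whole "if" block simply runs the script
directly, with no wrapper involved.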

Comment 7 Jay Freeman 1999-06-21 20:39:59 UTC

Ok, I went with the gdb :)  I was able to get:

Setting clock  (utc): ....   [  OK  ]
Loading default keymap       [  OK  ]
/etc/rc.d/rc.sysinit: /etc/sysconfig/i18n: No such file or directory
Program received signal SIGSEGV, Segmentation fault.
chunk_alloc (ar_ptr=0x40102580, nb=360) at malloc.c:2723
2723    malloc.c: No such file or directory

At this point, I was given a (gdb) prompt, but the thing continued
anyway, and everything went smoothly from that point forward.

Next thing it did was:

Activating swap partitions   [  OK  ]
Setting hostname olympus     [  OK  ]

etc...

A backtrace brought me through chunk_alloc, then __libc_malloc, then
6 strcpy's, and finally to __libc_start_main.  What bothers me though
is that it kept on going since it seems strange that the file would
fork (might be doing that, just don't REMEMBER it doing it, hehe).  I
might try doing a strace through the initlog as well (and maybe
compiling myself a debug version of the binary if I can get to the
point where TCP/IP works, since using gdb is allowing me a chance to
break out to a shell and do stuff).

Comment 8 Bill Nottingham 1999-06-21 21:45:59 UTC

OK, reproduced somewhat.

To fix the loopback problems, edit
/etc/sysconfig/network-scripts/ifup-aliases,
and change the 'return' to an 'exit 0' (I'm a dolt...)
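
Why the one-word change matters can be shown with two throwaway
scripts.  This is a sketch under the assumption that the script is
executed by bash rather than sourced; the file names are hypothetical,
and the thread does not say exactly how ifup-aliases was invoked in
4.21.  In bash, a 'return' outside a function or sourced file is an
error and execution carries on, while 'exit 0' stops the script.

```shell
# Hypothetical demo scripts, not the real ifup-aliases.
tmpdir=$(mktemp -d)

cat > "$tmpdir/with_return.sh" <<'EOF'
echo "before"
# Not inside a function: bash reports an error here and keeps going.
return 2>/dev/null
echo "after"
EOF

cat > "$tmpdir/with_exit.sh" <<'EOF'
echo "before"
# Ends the script immediately with a success status.
exit 0
echo "after"
EOF

out_return=$(bash "$tmpdir/with_return.sh")
out_exit=$(bash "$tmpdir/with_exit.sh")

printf 'with return:\n%s\n' "$out_return"
printf 'with exit 0:\n%s\n' "$out_exit"
rm -rf "$tmpdir"
```

Under bash, the first script prints both "before" and "after", and the
second prints only "before".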

Working on the crashing initlog problem now...

Comment 9 Jay Freeman 1999-06-21 22:46:59 UTC

Ok, got a debug binary compiled, and got the following backtrace from
gdb:

chunk_alloc < malloc.c:2723
__libc_malloc < malloc.c:2616
poptGetContext()
processArgs(8, ..., 1) < initlog.c:238
monitor("rc.sysinit", 19, 1, ..., 1, 0, 0) < process.c:166
runCommand("/etc/rc.d/rc.sysinit", 1, 0, 0) < process.c:220
processArgs(3, ..., 0) < initlog.c:301
main(3, ...) < initlog.c:315

Stepping through doesn't show much because A) it doesn't crash (hehe)
and B) if I try to follow the child at the fork(), gdb can't access
that memory and simply follows the parent, which closes, and then it
doesn't do that thing where it continues in the background.

Comment 10 Bill Nottingham 1999-06-21 22:51:59 UTC

Yup, initlog forks a lot. I found the bug, it's fixed in
initscripts-4.22, which will be in the next rawhide,
and can be found in the meantime at
http://charlotte.redhat.com/~notting/ftp/initscripts/
...

Comment 11 Brian Ryner 1999-06-21 23:11:59 UTC

Thanks for the speedy fix.  Should this fix the problem of the network
scripts hanging around after starting ppp, or is that not yet
resolved?

Comment 12 Bill Nottingham 1999-06-21 23:20:59 UTC

Yes, it's the same thing that would be hanging it
after the 'bringing up lo'.