Red Hat Bugzilla – Bug 112544
installing dev-3.3.9-2 rpm dumps trace to console and crashes
Last modified: 2007-04-18 13:00:46 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7a)
Description of problem:
All my attempts to install this rpm have led to a total crash of the
system. There is an extensive trace generated and dumped to the
console screen. I have not been able to find a core yet (I may need to
change some settings to get the core - I'll try to do so as soon as I
have the time).
The computer is a Dell Inspiron 7500 laptop (p2 mobile 450 CPU, 256 MB
Failure occurs in any system configuration I have tested - run level
3, run level 1, network up, no network running, ....
I have not tried running the rpm install with noscripts or anything
esoteric like that.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. rpm -Uv dev-3.3.9-2.i386.rpm
3. console filled with traceback of some sort
4. cycle power because nothing else will bring the machine back
When run with 'rpm -Uvvvh', the last entry created is /dev/compaq
What kernel are you running under?
Do you have a kernel trace?
Created attachment 96669 [details]
2.6.0-0.1.14 with selinux=0
Oops attached. Once it had oopsed, further actions such as shutdown -h now
caused another oops. Had to sync/remount/poweroff with sysrq
I also have a script output from rpm -Uvvh if that helps
I'm also on 2.6.0-0.1.14 kernel.
Not sure how to generate an Oops to attach.
I will need to wait until tonight to set up a serial capture, I think.
Any other way to get an Oops for you?
Created attachment 96670 [details]
kernel trace with selinux off
Includes startup messages so you can see kernel command line
OK - now that I've figured out how to capture an oops, I should note that
the trace with SELinux enabled is A LOT longer. Is there any value to
that as well? I'll assume no, unless someone asks for it.
Should this actually be filed against kernel? Got caught by it again
on another rawhide sync.
I don't know if it should be filed against kernel or dev - I guessed,
but I suspect either is justifiable.
Since I update as regularly as I can, I installed the rpm with the
--justdb flag for now until I see some action on this issue. But I
also checked last night and it still crashes my machines dead too.
Still occurring against rawhide whilst updating under kernel 2.6.0-1.21
and upgrading to dev 3.3.9-2. Booting into 2.4.22-1.2149 update works
Hmm: can you do a "fsck" to check the root fs? It's really useful to
know whether such problems are being triggered by something not right
on the filesystem, or whether it's repeatable on an error-free partition.
fsck says all clear
I have been meaning to get back to this. dev-3.3.9-2 successfully
applied maybe 2 kernels ago - I'll have to check dates after I finish
getting the current oops. I was just about to report this bug closed,
but then dev-3.3.10-1 came along and broke things again. I'm still
collecting info, but the oops starts like:
Unable to handle kernel paging request at virtual address c2f84e48
*pde = 0000b067
Oops: 0002 [#1]
EIP: 0060:[<c01aa216>] Not tainted
EIP is at sync_sb_inodes+0x56/0x3d0
eax: c9808e44 ebx: cf5d5ca0 ecx: cf5d5c98 edx: c2f84e44
esi: cf5d5ca0 edi: cf5d5bf8 ebp: 00000000 esp: cfa63e7c
ds: 007b es: 007b ss: 0068
Process pdflush (pid: 7, threadinfo=cfa62000 task=cfa8e960)
Stack: 00000000 00000296 c0465dc0 00000000 c026b63c 00000064 00000000
00000246 cf5d5c5c ffffe41a cf5d5bf8 cfa63ef0 00002f22 00000000
cf5d5bf8 cfa63ef0 cfa63ef0 cfa63fd0 cfa63f10 00000000 c0154778
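For anyone reading the oops header above, here is a minimal sketch of how the "Oops:" code and the "EIP is at" line can be picked apart. It assumes the usual i386 page-fault error-code bits (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper names `decode_oops_code` and `parse_eip_symbol` are made up for this illustration and are not part of any kernel tool.

```python
import re

def decode_oops_code(code_hex):
    """Decode the hex error code printed after 'Oops:' (assumed i386 bit layout)."""
    code = int(code_hex, 16)
    return {
        "cause": "protection fault" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }

def parse_eip_symbol(line):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, function size)."""
    m = re.search(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    if m is None:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

if __name__ == "__main__":
    # Values copied from the oops quoted in this report.
    print(decode_oops_code("0002"))
    print(parse_eip_symbol("EIP is at sync_sb_inodes+0x56/0x3d0"))
```

Under that assumption, code 0002 reads as a kernel-mode write to a page that was not present, at offset 0x56 into sync_sb_inodes.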
Created attachment 96989 [details]
dev-3.3.10 oops selinux=0
After installing kernel-2.6.1-1.43 I was able to successfully install
If you're still listening, Paul, have you tried that combination?
Karl, could you try that more than once, just to
be sure? You can re-install an already-installed rpm with "rpm -Uvh
--force" for testing purposes. I'd like to double-check before we
Created attachment 97057 [details]
kernel message with 2.6.1-1.43 and dev-3.3.10-1
search for "Preparing" to get past selinux warnings
note messages from "ext3_destroy_inode"
at end of file, note that two orphaned inodes have been created. (I did an fsck
and reboot immediately before this installation)
To be clear, rerunning the install works on the surface - the install
completes, and the system behaves normally. Ext3 errors just tend to
worry me, but if that is a concern, it might not be related to the dev
package for all I know. It may need to be a new bugzilla.
Also, I'm not 100% sure that dev is causing the orphaned inodes - it
could well be something else that happens in bootup.
did a few more reboots, and I'm now pretty confident that the orphaned
inodes are from the rpm install of the dev package
Yes I got ext3 errors/orphaned inodes after dev install on that kernel
although it didn't hang.
Through a combination of that and most likely my own error, / seems to
be hosed. I do feel there is possibly still an issue here :(
Orphans can be a natural consequence of such updates. If you've got a
file open and you delete the file and recreate it, the orphan remains
as long as the original, deleted inode is open. If the application
which opened it doesn't ever close it, then the kernel *cannot*
reclaim it until the next reboot. In such cases, orphans are simply a
sign of the kernel doing its job correctly.
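As a rough userspace illustration of the case described above (it is not taken from rpm or the dev package), the sketch below deletes a file while it is still open. The function name `orphan_demo` is hypothetical. It only shows that the inode outlives its name while a descriptor is held; it does not inspect the ext3 orphan list itself.

```python
import os
import tempfile

def orphan_demo():
    """Unlink a file while it is still open and report what remains."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+") as f:
        f.write("still here")
        f.flush()
        os.unlink(path)                         # remove the only directory entry
        links = os.fstat(f.fileno()).st_nlink   # link count of the now-nameless inode
        visible = os.path.exists(path)          # whether the old name still resolves
        f.seek(0)
        data = f.read()                         # contents read through the open descriptor
    # Leaving the with-block closes the descriptor; only then can the inode be freed.
    return links, visible, data

if __name__ == "__main__":
    print(orphan_demo())
```

On a typical Linux filesystem the link count drops to zero while the data stays readable, which is the state that would be left as an orphan if the machine went down before the descriptor was closed.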
Slab corruption: start=c9108084, expend=c9108303, problemat=c9108154
Last user: [<d088dfdd>](ext3_destroy_inode+0x1d/0x30 [ext3])
are a different matter entirely, and I'll try to recreate that here.
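For reference, a small sketch of the arithmetic on the slab corruption line quoted above. It assumes that "start" and "expend" bound the affected object and that "problemat" is the first address found to be wrong; that reading of the message is an assumption, and the helper name `slab_offsets` is hypothetical.

```python
import re

def slab_offsets(line):
    """Return (offset of problem from start, distance from start to expend)."""
    m = re.search(
        r"start=([0-9a-f]+), expend=([0-9a-f]+), problemat=([0-9a-f]+)", line
    )
    if m is None:
        return None
    start, expend, problemat = (int(g, 16) for g in m.groups())
    return problemat - start, expend - start

if __name__ == "__main__":
    # Line copied from this report.
    msg = "Slab corruption: start=c9108084, expend=c9108303, problemat=c9108154"
    offset, span = slab_offsets(msg)
    print(hex(offset), hex(span))
```

With the values from this report, the reported problem address sits 0xd0 bytes past the start address.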