From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7a) Gecko/20031221 Firebird/0.7+

Description of problem:
All my attempts to install this rpm have led to a total crash of the system. There is an extensive trace generated and dumped to the console screen. I have not been able to find a core yet (I may need to change some settings to get the core; I'll try to do so as soon as I have the time).

The computer is a Dell Inspiron 7500 laptop (P2 mobile 450 CPU, 256 MB RAM).

Failure occurs in every system configuration I have tested: run level 3, run level 1, network up, no network running, and so on. I have not tried running the rpm install with noscripts or anything esoteric like that.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. rpm -Uv dev-3.3.9-2.i386.rpm
2. wait
3. console filled with traceback of some sort
4. cycle power because nothing else will bring the machine back

Additional info:
When run with 'rpm -Uvvvh', the last entry created is /dev/compaq.
What kernel are you running under? Do you have a kernel trace?
Created attachment 96669 [details] Kernel trace 2.6.0-0.1.14 with selinux=0
Oops attached. Once it had oopsed, further actions such as "shutdown -h now" caused another oops. I had to sync/remount/poweroff with sysrq. I also have script output from rpm -Uvvh if that helps.
I'm also on 2.6.0-0.1.14 kernel. Not sure how to generate an Oops to attach. I will need to wait until tonight to set up a serial capture, I think. Any other way to get an Oops for you?
Created attachment 96670 [details] kernel trace with selinux off Includes startup messages so you can see kernel command line
OK - now that I've figured out how to make an oops, I should note that the trace with SELinux enabled is A LOT longer. Is there any value to that as well? I'll assume no, unless someone asks for it.
Should this actually be filed against kernel? Got caught by it again on another rawhide sync.
I don't know if it should be filed against kernel or dev - I guessed, but I suspect either is justifiable. Since I update as regularly as I can, I installed the rpm with the --justdb flag for now until I see some action on this issue. But I also checked last night and it still crashes my machines dead too.
Still occurring against rawhide whilst updating under kernel 2.6.0-1.21 and upgrading to dev 3.3.9-2. When booted into 2.4.22-1.2149, the update works as expected.
Hmm: can you do a "fsck" to check the root fs? It's really useful to know whether such problems are being triggered by something not right on the filesystem, or whether it's repeatable on an error-free partition.
fsck says all clear.
I have been meaning to get back to this. dev-3.3.9-2 applied successfully maybe 2 kernels ago - I'll have to check dates after I finish getting the current oops. I was just about to report this bug closed, but then dev-3.3.10-1 came along and broke things again. I'm still collecting info, but the oops starts like:
Unable to handle kernel paging request at virtual address c2f84e48
 printing eip:
c01aa216
*pde = 0000b067
Oops: 0002 [#1]
CPU:    0
EIP:    0060:[<c01aa216>]    Not tainted
EFLAGS: 00010297
EIP is at sync_sb_inodes+0x56/0x3d0
eax: c9808e44   ebx: cf5d5ca0   ecx: cf5d5c98   edx: c2f84e44
esi: cf5d5ca0   edi: cf5d5bf8   ebp: 00000000   esp: cfa63e7c
ds: 007b   es: 007b   ss: 0068
Process pdflush (pid: 7, threadinfo=cfa62000 task=cfa8e960)
Stack: 00000000 00000296 c0465dc0 00000000 c026b63c 00000064 00000000 cf5d5ca0
       00000246 cf5d5c5c ffffe41a cf5d5bf8 cfa63ef0 00002f22 00000000 c01aa6fc
       cf5d5bf8 cfa63ef0 cfa63ef0 cfa63fd0 cfa63f10 00000000 c0154778 cfa63ef0
Call Trace:
 [<c026b63c>] blk_congestion_wait+0x8c/0xa0
 [<c01aa6fc>] writeback_inodes+0x16c/0x450
 [<c0154778>] get_page_state+0x18/0x20
 [<c01557a7>] wb_kupdate+0xa7/0x120
 [<c015612d>] __pdflush+0x21d/0x620
 [<c0156530>] pdflush+0x0/0x20
 [<c015653f>] pdflush+0xf/0x20
 [<c0155700>] wb_kupdate+0x0/0x120
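For anyone comparing several of these traces by hand, here is a rough sketch of how the call trace entries above could be pulled apart into address, symbol, offset and function size. It is only an illustration: the names `TRACE_ENTRY` and `parse_call_trace` are made up for this note, and the pattern assumes the `[<address>] symbol+0xoffset/0xsize` layout seen in this oops, which other kernel versions may not follow exactly.

```python
import re

# Hypothetical helper, not part of any kernel or rpm tooling.
# Matches entries such as "[<c026b63c>] blk_congestion_wait+0x8c/0xa0".
TRACE_ENTRY = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([A-Za-z_]\w*)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_call_trace(text):
    """Return one dict per call trace entry found in an oops dump."""
    entries = []
    for address, symbol, offset, size in TRACE_ENTRY.findall(text):
        entries.append({
            "address": int(address, 16),  # return address on the stack
            "symbol": symbol,             # function containing that address
            "offset": int(offset, 16),    # bytes into the function
            "size": int(size, 16),        # total size of the function
        })
    return entries
```

Feeding it the text of an oops gives a list that can be compared between the selinux=0 and selinux=1 traces, for example by looking only at the `symbol` fields. Lines without a `symbol+0x...` part, such as the bare `EIP:` line, are skipped by this pattern.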
Created attachment 96989 [details] dev-3.3.10 oops selinux=0
After installing kernel-2.6.1-1.43 I was able to successfully install dev-3.3.10-1. If you're still listening, Paul, have you tried that combination?
Karl --- just to be sure, could you try that more than once? You can re-install an already-installed rpm with "rpm -Uvh --force" for testing purposes. I'd like to double-check before we close this. Thanks!
Created attachment 97057 [details] kernel message with 2.6.1-1.43 and dev-3.3.10-1
Search for "Preparing" to get past the selinux warnings. Note the messages from "ext3_destroy_inode" at the end of the file, and note that two orphaned inodes have been created. (I did an fsck and a reboot immediately before this installation.)
To be clear, rerunning the install works on the surface - the install completes, and the system behaves normally. Ext3 errors just tend to worry me, but if that is a concern, it might not be related to the dev package for all I know. It may need to be a new bugzilla. Also, I'm not 100% sure that dev is causing the orphaned inodes - it could well be something else that happens in bootup.
Did a few more reboots, and I'm now pretty confident that the orphaned inodes are from the rpm install of the dev package.
Yes, I got ext3 errors/orphaned inodes after the dev install on that kernel, although it didn't hang. Through a combination of that and most likely my own error, / seems to be hosed. I do feel there is possibly still an issue here :(
Orphans can be a natural consequence of such updates. If you've got a file open and you delete the file and recreate it, the orphan remains as long as the original, deleted inode is open. If the application which opened it doesn't ever close it, then the kernel *cannot* reclaim it until the next reboot. In such cases, orphans are simply a sign of the kernel doing its job correctly.
The following messages are a different matter entirely, and I'll try to recreate that here:
Slab corruption: start=c9108084, expend=c9108303, problemat=c9108154
Last user: [<d088dfdd>](ext3_destroy_inode+0x1d/0x30 [ext3])
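The harmless orphan case described above (open a file, delete it, recreate it) can be reproduced from user space. The following is a minimal sketch, assuming a POSIX system where an open file may be unlinked; the function name `demonstrate_orphan` and the file name `node` are invented for this illustration and have nothing to do with what the dev package scripts actually do.

```python
import os
import tempfile


def demonstrate_orphan():
    """Delete and recreate a file while the original is still held open."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "node")  # hypothetical file name
        with open(path, "w") as f:
            f.write("old contents")

        # Stand-in for an application that keeps the original file open.
        held = open(path, "r")

        os.unlink(path)                  # remove the name; the inode lives on
        with open(path, "w") as f:       # recreate the same path
            f.write("new contents")

        orphan = os.fstat(held.fileno())
        result = {
            # Link count of the deleted-but-open inode.
            "orphan_links": orphan.st_nlink,
            # Whether the recreated path reuses the old inode.
            "same_inode": orphan.st_ino == os.stat(path).st_ino,
            # The old data is still readable through the open descriptor.
            "orphan_data": held.read(),
        }
        held.close()                     # only now can the inode be reclaimed
        return result
```

On Linux the returned link count should be zero while the descriptor is open, and the recreated path should refer to a different inode. If the holder never closes the file and the machine goes down uncleanly, that unlinked inode is what ext3 later reports and cleans up as an orphan.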