Description of problem:
There are 3 kerneloops in /var/log/messages. Every 120 seconds they are entered into /var/cache/abrt as new dumps. The only difference is the time. I have uploaded them with the GUI; they are apparently identified as identical and uploaded, but they keep reproducing themselves.

Version-Release number of selected component (if applicable):
1.0.4-1.fc12

How reproducible:
It keeps happening after several reboots, also with another kernel version.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I think you can blame me for this, but on the bright side, it is fixed (again) now. Please try abrt-1.0.6 from http://koji.fedoraproject.org/koji/buildinfo?buildID=154271

Please also attach your /var/log/messages to this bug.
Created attachment 390053 [details] /var/log/messages This file makes abrtd generate thousands of identical reports in /var/cache/abrt
I have installed abrt-1.0.6 from the testing repository. It started by deleting the reports, but now I have 3900 reports in /var/cache/abrt!
You can delete /var/cache/abrt; abrtd will recreate it on restart.

However, our oops dup detector in CAnalyzerKerneloops::GetLocalUUID() is really *way* too simplistic:

    unsigned len = oops.length();
    unsigned hash = len;
    for (unsigned i = 0; i < len; i++)
    {
        hash = ((hash << 5) ^ (hash >> 27)) ^ oops[i];
    }
    hash &= 0x7FFFFFFF;

Looking at your oops, it's obvious that the code above will generate a different hash for each new oops, because the pid, addresses etc. will be different:

BUG: Dentry ffff88006037e0c0{i=2,n=/} still in use (3) [unmount of cifs cifs]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:669!
invalid opcode: 0000 [#3] SMP.
last sysfs file: /sys/module/hwmon_vid/refcnt
CPU 1.
Modules linked in: cpufreq_ondemand cifs nls_utf8 fuse sunrpc powernow_k8 freq_table ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 microcode ext2 uinput snd_hd
Pid: 3259, comm: umount.cifs Tainted: G D 2.6.31.12-174.2.3.fc12.x86_64 #1 GA-MA790GPT-UD3H
RIP: 0010:[<ffffffff8110c78c>] [<ffffffff8110c78c>] shrink_dcache_for_umount_subtree+0x135/0x1f9
RSP: 0018:ffff88019e913e08 EFLAGS: 00010292
RAX: 0000000000000054 RBX: ffff88006037e0c0 RCX: 00000000000037a2
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff88019e913e38 R08: 0000000000000000 R09: ffff88019d93b2e0
R10: ffffffff814569e0 R11: ffff88019d93b2e0 R12: 0000000000000302
R13: ffff88006037e0c0 R14: ffff88011ddd6c60 R15: ffff88019e913f18
FS: 00007fae82f80700(0000) GS:ffff880028051000(0000) knlGS:00000000f77496c0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fae82ad9090 CR3: 0000000195539000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount.cifs (pid: 3259, threadinfo ffff88019e912000, task ffff88011439af00)
Stack: ffff880084cb3288 ffffffff81201614 ffff880084cb3000 ffffffffa034e550 ffff880084cb3000 ffffffff816f9160 ffff88019e913e58 ffffffff8110c88c ffffffff8166e9c0 ffff880084cb3000 ffff88019e913e78 ffffffff810fea44
Call Trace:
[<ffffffff81201614>] ? __down_read_trylock+0x46/0x4e
[<ffffffff8110c88c>] shrink_dcache_for_umount+0x3c/0x4c
[<ffffffff810fea44>] generic_shutdown_super+0x1f/0xc9
[<ffffffff810feb43>] kill_anon_super+0x16/0x4f
[<ffffffff810ff248>] deactivate_super+0x56/0x6e
[<ffffffff81111d3c>] mntput_no_expire+0xb4/0xec
[<ffffffff8111230f>] sys_umount+0x2de/0x30d
[<ffffffff81104575>] ? path_put+0x22/0x27
[<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b
Code: 50 30 4c 8b 0a 31 d2 48 85 f6 74 04 48 8b 56 40 48 05 88 02 00 00 48 89 de 48 c7 c7 6e cc 57 81 48 89 04 24 31 c0 e8 42 ed 30 00 <0f> 0b eb fe 4c 8b 6b 28 4c 39 eb 75 05
RIP [<ffffffff8110c78c>] shrink_dcache_for_umount_subtree+0x135/0x1f9
RSP <ffff88019e913e08>
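To make the sensitivity concrete, here is a minimal standalone sketch of the loop quoted above. It assumes a 32-bit unsigned (written as uint32_t) and treats bytes as unsigned; oops_hash is a name chosen for this illustration, not the abrt function.

```cpp
#include <cstdint>
#include <string>

// Standalone version of the rotate-and-XOR loop quoted above.
// Assumptions: "unsigned" is 32 bits wide (uint32_t is used here) and
// bytes are treated as unsigned. oops_hash is an illustrative name.
uint32_t oops_hash(const std::string& oops)
{
    uint32_t hash = static_cast<uint32_t>(oops.length());
    for (unsigned char c : oops)
    {
        // Rotate left by 5 bits, then mix in the next byte.
        hash = ((hash << 5) ^ (hash >> 27)) ^ c;
    }
    return hash & 0x7FFFFFFF;
}
```

With this, oops_hash("Pid: 3259") and oops_hash("Pid: 3258") differ, because the final byte is XORed straight into the low bits of the result. A one-digit change in a pid or address is therefore enough to produce a new hash.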
Well, as the files in the different kerneloops directories were identical, the simplistic check should be sufficient to avoid generating new reports from the same error in the logfile. There must be a bug, so that you are not actually comparing the 'new' oops against the old one.
Please note that the 4000 reports were generated from only 3 messages in /var/log/messages.
I copied the file you provided to my /var/log/messages, and for me abrtd created 3 oopses. It does not create more oopses as time passes.

I believe the fix for the bug where abrtd files new oopses at every log scan pass was applied to git in this commit:

commit 640af192338643b3c9e6fbe0304726e951239c2b
Author: Denys Vlasenko <vda.linux>
Date: Wed Nov 11 19:32:19 2009 +0100

    KerneloopsSysLog: fix breakage in code which detects abrt marker

and 1.0.6 is much newer than this.

If you do observe new oopses being filed at every log scan pass (i.e. every 2 minutes or so), please run "killall abrtd", then "abrtd -dvvv 2>&1 | tee LOG", wait for a few kerneloops scans (you will see the corresponding messages appearing), then press ^C and attach the resulting LOG file to this bug.
Well, it is hard to keep it running now:

abrtd: Loaded .conf
abrtd: can't load '/usr/lib64/abrt/lib.so': /usr/lib64/abrt/lib.so: cannot open shared object file: No such file or directory
Created attachment 390330 [details] first trace
Created attachment 390332 [details] second trace
This should be fixed in 1.0.6, which is already in the stable repo, so I'd recommend updating to the latest version. Jirka
Which I already did.

154112 3 feb 16:09 /usr/sbin/abrtd

That version should be withdrawn immediately.

abrtd: Loaded .conf
abrtd: can't load '/usr/lib64/abrt/lib.so': /usr/lib64/abrt/lib.so: cannot open shared object file: No such file or directory
You're right, it's fixed after 1.0.6 :-/ The only known workaround for this is to remove the contents of /var/cache/abrt. Jirka.
With 1.0.6 there is no problem. It terminates itself after 120 seconds due to the missing lib.so file, so only one set of duplicates is generated ;-)
abrt-1.0.7-1.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/abrt-1.0.7-1.fc12
abrt-1.0.7-1.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update abrt'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1598
It is now repeatedly incrementing the bug count in the database instead of creating directories. It still does not recognize old bugs.
abrt-1.0.9 will have a different oops hashing algorithm, making it less prone to generating different hashes for similar oopses.
Well, I think more focus should be paid to why the ONE and SAME oops is not recognized. We are not talking about similar oopses here. We are talking about the same oops, listed only ONCE, being counted over and over again. The current algorithm seems OK for handling that.
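As a sketch only of what a less sensitive scheme could look like (this is not the actual abrt-1.0.9 algorithm, which this report does not show): build the hash input from the function names in the trace lines and drop addresses, offsets and register values. frame_symbol and normalized_oops_key are hypothetical names.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: extract "func" from a trace line such as
//   [<ffffffff8110c88c>] shrink_dcache_for_umount+0x3c/0x4c
// Returns an empty string if the line carries no such frame.
std::string frame_symbol(const std::string& line)
{
    if (line.find("[<") == std::string::npos)
        return "";
    // Use the last ">]" so lines with two bracketed addresses still work.
    std::size_t close = line.rfind(">]");
    if (close == std::string::npos)
        return "";
    // Skip spaces and the "?" marker the kernel prints for uncertain frames.
    std::size_t start = line.find_first_not_of(" ?", close + 2);
    if (start == std::string::npos)
        return "";
    // The symbol ends at the "+offset" part (or at the next space).
    std::size_t end = line.find_first_of("+ ", start);
    if (end == std::string::npos)
        return line.substr(start);
    return line.substr(start, end - start);
}

// Hypothetical helper: one function name per line, everything else dropped.
// Any string hash can then be applied to the returned key.
std::string normalized_oops_key(const std::string& oops)
{
    std::istringstream in(oops);
    std::string line, key;
    while (std::getline(in, line))
    {
        std::string sym = frame_symbol(line);
        if (!sym.empty())
        {
            key += sym;
            key += '\n';
        }
    }
    return key;
}
```

Two oopses with the same call chain but different pids and load addresses yield the same key under this scheme. The trade-off is that distinct bugs sharing a call chain would be merged as well.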
abrt-1.0.7-1.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.
Reopening, since there seems to be a problem with the oops marker: we detect the same oops repeatedly even if it is listed in /var/log/messages only once.
(In reply to comment #19)
> well, I think more focus should be paid to why the ONE and SAME oops is not
> recognized. We are not talking about similar oopses here. We are talking about
> the same oops only listed ONCE counted over and over again. The current
> algorithm seems OK for handling that.

Yes. I want to do this. But (as I already said): I copied the file you provided to my /var/log/messages, and for me abrtd created 3 oopses. It does not create more oopses as time passes.

I am looking at your "abrtd -dvvv" logs. I see that you were bitten by the dreaded "can't load '/usr/lib64/abrt/lib.so'" bug. This is caused by abrtd trying to load a plugin with the name "" (empty string). I still do not know what is triggering it, but we have already added code which makes abrtd survive that (and log a lot more information about this bug, which hopefully will allow it to be caught).

Can you retest with a newer version? For F12, the latest is: http://koji.fedoraproject.org/koji/buildinfo?buildID=156498
abrt-1.0.9-2.fc12.x86_64

It keeps counting the kernel oops as a new crash, showing the red icon in GNOME.

In /var/cache/abrt (the count has now reached 15 from only one oops):

-rw-r--r-- 1 root root 8192 21 jun 10:38 abrt-db
drwxr-x--- 2 abrt root 4096 21 jun 01:17 kerneloops-1277075799-1
-rw------- 1 root root 13 20 jun 19:18 last-ccpp

kerneloops-1277075799-1 remains unchanged (it has been reported).
-rw-r--r-- 1 root root 8192 21 jun 18:40 abrt-db

We have now reached 26 and are still counting. Amazing how difficult it seems to be to insert: if A = A then do nothing; ;-)
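The check asked for here amounts to remembering what has already been recorded. A minimal sketch, assuming an in-memory set of oops hashes; OopsRegistry and record_once are hypothetical names, and how abrtd actually tracks known oopses in abrt-db is not shown in this report.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical in-memory registry of oops hashes that were already recorded.
class OopsRegistry
{
public:
    // Returns true only the first time a given hash is offered.
    // A repeat changes nothing: "if A = A then do nothing".
    bool record_once(uint32_t hash)
    {
        return seen_.insert(hash).second;
    }

    // Number of distinct oopses recorded so far.
    std::size_t size() const { return seen_.size(); }

private:
    std::unordered_set<uint32_t> seen_;
};
```

A scan pass would call record_once() for each oops it finds and bump the crash count only when the call returns true, so rescanning the same log entry every 120 seconds would leave the count unchanged.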
Hi, do you still see this problem on your machine with the latest abrt? J.
version 1.1.1-2.fc12 I haven't seen this problem recently
(In reply to comment #26)
> version 1.1.1-2.fc12
>
> I haven't seen this problem recently

Thank you, closing.