Description of problem:
slocate (updatedb) causes
(gdb) run
Starting program: /usr/bin/updatedb
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xffffe000
*** glibc detected *** /usr/bin/updatedb: corrupted double-linked list: 0x0805e8e0 ***
======= Backtrace: =========
/lib/libc.so.6[0x41a093ec]
/lib/libc.so.6(__libc_free+0x6b)[0x41a0988f]
/usr/bin/updatedb(fts_close+0x44)[0x804cc70]
/usr/bin/updatedb[0x804b37b]
/lib/libc.so.6(__libc_start_main+0xb3)[0x419bedfb]
/usr/bin/updatedb[0x80493f1]
======= Memory map: ========
08047000-08050000 r-xp 00000000 03:02 349837 /usr/bin/slocate
08050000-08051000 rw-p 00009000 03:02 349837 /usr/bin/slocate
08051000-08078000 rw-p 08051000 00:00 0 [heap]
...bla bla...
bfad1000-bfae6000 rw-p bfad1000 00:00 0 [stack]
ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
Program received signal SIGABRT, Aborted.
0x419d12c5 in raise () from /lib/libc.so.6
(gdb) where
#0 0x419d12c5 in raise () from /lib/libc.so.6
#1 0x419d27eb in abort () from /lib/libc.so.6
#2 0x41a039c1 in __libc_message () from /lib/libc.so.6
#3 0x41a093ec in _int_free () from /lib/libc.so.6
#4 0x41a0988f in free () from /lib/libc.so.6
#5 0x0804cc70 in fts_close (sp=0x8053780) at sl_fts.c:244
#6 0x0804b37b in create_db (dirstr=0x0) at main.c:1072
#7 0x419bedfb in __libc_start_main () from /lib/libc.so.6
#8 0x080493f1 in _start ()
at this file /proc/net/rpc/flush
(yeah, missing proc in updatedb.conf)
Version-Release number of selected component (if applicable):
2.7-22
Additional info:
Is there any good reason NOT to add proc and sysfs fs to PRUNEFS into rpm updatedb.conf?
Created attachment 117277
Patch for updatedb.conf; making the change by hand may be faster than applying it.
Thanks for your report. proc, sysfs and similar filesystems are excluded in /etc/cron.daily/slocate.cron; that code will handle any new nodev filesystems used by the kernel, so adding proc to /etc/updatedb.conf is just redundant. Rather than avoiding the problem, the crash should be fixed; I'll try to find the cause.
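For illustration, the exclusion described above could work roughly as follows: read /proc/filesystems, collect every type flagged `nodev`, and treat that list as the set of filesystems to skip. This is a minimal Python sketch of the idea, not the actual contents of slocate.cron; the function name `nodev_filesystems` and the sample input are assumptions made for the example.

```python
def nodev_filesystems(text):
    """Return the filesystem types marked 'nodev' in /proc/filesystems-style text."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        # Virtual filesystems are listed as "nodev<TAB>proc"; filesystems that
        # need a block device have only the type name, e.g. "<TAB>ext3".
        if len(fields) == 2 and fields[0] == "nodev":
            result.append(fields[1])
    return result


# Hypothetical sample in the format of /proc/filesystems.
SAMPLE = "nodev\tsysfs\nnodev\tproc\n\text3\nnodev\tnfs\n"

if __name__ == "__main__":
    # On a Linux system the real list could be read with:
    #   open("/proc/filesystems").read()
    print(" ".join(nodev_filesystems(SAMPLE)))
```

Because the list is derived at run time, a newly added nodev filesystem would be skipped without any change to /etc/updatedb.conf, which is the point made above.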
(In reply to comment #2)
> proc, sysfs and similar filesystems are excluded in /etc/cron.daily/slocate.cron;
> that code will handle any new nodev filesystems used by the kernel, so
> adding proc to /etc/updatedb.conf is just redundant.
The results of running updatedb by hand and from cron should be the same. Maybe it would be better to make updatedb a script that also checks /proc/filesystems, instead of symlinking it to slocate. Just a thought...
> Rather than avoiding the problem, the crash should be fixed; I'll try to find
> the cause.
The problem is here, at main.c:1093: fts_close(dir); but I don't have time to debug it now (??glibc bug??).
I can't reproduce the problem so far.
- What does "at this file /proc/net/rpc/flush" mean?
- Does the crash happen every time?
- Does (rpm -V slocate) report anything?
- If the crash is reproducible, can you please install the debuginfo package and run (valgrind --tool=memcheck --num-callers=20 updatedb) ?.
(In reply to comment #4)
> I can't reproduce the problem so far.
> - What does "at this file /proc/net/rpc/flush" mean?
I don't know now, it isn't there.
> - Does the crash happen every time?
Yeah, it did (on the same file ?or dir?).
> - Does (rpm -V slocate) report anything?
No, the rpm was OK. I also tried my own compiled slocate and ran it from . without results.
> - If the crash is reproducible, can you please install the debuginfo package
> and run (valgrind --tool=memcheck --num-callers=20 updatedb) ?.
No, it isn't now.
Sorry for wasting your time, maybe it could be a kernel bug:
find: WARNING: Hard link count is wrong for /proc: this may be a bug in your
filesystem driver. Automatically turning on find's -noleaf option. Earlier
results may have failed to include directories that should have been searched.
I'll dig into the problem and let you know.
Thanks for your quick reply.
(In reply to comment #5)
> > - What does "at this file /proc/net/rpc/flush" mean?
> I don't know now, it isn't there.
That's why I am asking :)
> > - Does the crash happen every time?
> Yeah, it did (on the same file ?or dir?).
How do you know it happens on the same file?
> Sorry for wasting your time, maybe it could be a kernel bug:
> find: WARNING: Hard link count is wrong for /proc: this may be a bug in your
> filesystem driver. Automatically turning on find's -noleaf option. Earlier
> results may have failed to include directories that should have been searched.
This could be related, but slocate still should not crash. Nevertheless, slocate runs fine here even though find reports the same warning.
> I'll dig into the problem and let you know.
Thanks, please reset the bug to the ASSIGNED state when you discover anything.
(In reply to comment #6)
> > > - Does the crash happen every time?
> > Yeah, it did (on the same file ?or dir?).
> How do you know it happens on the same file?
updatedb -v
started 5 times (and it stopped on the same file or dir or what was it)
> > Sorry for wasting your time, maybe it could be a kernel bug:
> > find: WARNING: Hard link count is wrong for /proc: this may be a bug in your
> > filesystem driver. Automatically turning on find's -noleaf option. Earlier
> > results may have failed to include directories that should have been searched.
> This could be related, but slocate still should not crash. Nevertheless,
> slocate runs fine here even though find reports the same warning.
Maybe there could be some "bigger" underrun if the file was there (nfs creates
it in some cases: drivers/net/sunrpc/cache.c, but i don't when).
Now the warning is here, but updatedb runs ok, as you write.
> > I'll dig into the problem and let you know.
> Thanks, please reset the bug to the ASSIGNED state when you discover anything.
It was a mistake, i didn't want to change the state :(.
(In reply to comment #7)
> (In reply to comment #6)
> > > > - Does the crash happen every time?
> > > Yeah, it did (on the same file ?or dir?).
> > How do you know it happens on the same file?
> updatedb -v
> started 5 times (and it stopped on the same file or dir or what was it)
Thanks, that was missing from the original description.
> Maybe there could be some "bigger" underrun if the file was there (nfs creates
> it in some cases: drivers/net/sunrpc/cache.c, but i don't when).
> Now the warning is here, but updatedb runs ok, as you write.
linux/net/sunrpc/cache.c:cache_register() can AFAICS only create /proc/net/rpc/$something/flush .
> > > I'll dig into the problem and let you know.
> > Thanks, please reset the bug to the ASSIGNED state when you discover anything.
> It was a mistake, i didn't want to change the state :(.
It probably wasn't your mistake then; bugzilla might be doing it automatically.
One more question: at the time updatedb was crashing, were there any very deep directory hierarchies (absolute path names longer than roughly 4350 bytes) on the filesystem?
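One possible way to answer the path-length question is to walk the tree and report the longest absolute path in bytes. The following is a minimal Python sketch under that assumption; the function name `longest_path` is made up for the example, and unlike updatedb it does not prune any filesystems or stay on one device.

```python
import os


def longest_path(root):
    """Return (length_in_bytes, path) for the longest absolute path under root."""
    root = os.path.abspath(root)
    best = (len(os.fsencode(root)), root)
    # os.walk does not follow symlinked directories by default; unreadable
    # directories are skipped silently via the onerror callback.
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            length = len(os.fsencode(path))  # byte length, not character count
            if length > best[0]:
                best = (length, path)
    return best


if __name__ == "__main__":
    length, path = longest_path(".")
    print(length, path)
```

A result well below the roughly 4350 bytes mentioned above would suggest that very deep hierarchies were not involved.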
(In reply to comment #8)
> linux/net/sunrpc/cache.c:cache_register() can AFAICS only create
> /proc/net/rpc/$something/flush .
It is possible, but I use recent mm kernels for testing and there could be a bug. I don't think I have used nfs, so I really don't know what the flush file was :(.
> One more question: at the time updatedb was crashing, were there any
> very deep directory hierarchies (absolute path names longer than roughly 4350
> bytes) on the filesystem?
I don't think so. I didn't make any big changes, and now I have only one file whose absolute path is approx. 250 chars long.
The new slocate-2.7-22.fc4.1 package fixes several bugs that could cause similar crashes. If you find a way to reproduce the original problem (with either -22 or -22.fc4.1), please reopen this bug.