Bug 164637

Summary: Slocate error with proc fs
Product: [Fedora] Fedora
Reporter: Jiri Slaby <jirislaby>
Component: slocate
Assignee: Miloslav Trmač <mitr>
Status: CLOSED ERRATA
QA Contact: Brock Organ <borgan>
Severity: medium
Priority: high
Version: 4
CC: jaboutbo
Hardware: All
OS: Linux
Fixed In Version: 2.7-22.fc4.1
Doc Type: Bug Fix
Last Closed: 2005-08-23 12:55:38 UTC
Attachments:
Patch you can apply to updatedb.conf, maybe by hand it may be faster (flags: none)

Description Jiri Slaby 2005-07-29 15:53:14 UTC
Description of problem:
slocate (updatedb) aborts with the following glibc error:
(gdb) run
Starting program: /usr/bin/updatedb 
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xffffe000
*** glibc detected *** /usr/bin/updatedb: corrupted double-linked list:
0x0805e8e0 ***
======= Backtrace: =========
/lib/libc.so.6[0x41a093ec]
/lib/libc.so.6(__libc_free+0x6b)[0x41a0988f]
/usr/bin/updatedb(fts_close+0x44)[0x804cc70]
/usr/bin/updatedb[0x804b37b]
/lib/libc.so.6(__libc_start_main+0xb3)[0x419bedfb]
/usr/bin/updatedb[0x80493f1]
======= Memory map: ========
08047000-08050000 r-xp 00000000 03:02 349837     /usr/bin/slocate
08050000-08051000 rw-p 00009000 03:02 349837     /usr/bin/slocate
08051000-08078000 rw-p 08051000 00:00 0          [heap]
...bla bla...
bfad1000-bfae6000 rw-p bfad1000 00:00 0          [stack]
ffffe000-fffff000 ---p 00000000 00:00 0          [vdso]

Program received signal SIGABRT, Aborted.
0x419d12c5 in raise () from /lib/libc.so.6
(gdb) where
#0  0x419d12c5 in raise () from /lib/libc.so.6
#1  0x419d27eb in abort () from /lib/libc.so.6
#2  0x41a039c1 in __libc_message () from /lib/libc.so.6
#3  0x41a093ec in _int_free () from /lib/libc.so.6
#4  0x41a0988f in free () from /lib/libc.so.6
#5  0x0804cc70 in fts_close (sp=0x8053780) at sl_fts.c:244
#6  0x0804b37b in create_db (dirstr=0x0) at main.c:1072
#7  0x419bedfb in __libc_start_main () from /lib/libc.so.6
#8  0x080493f1 in _start ()

at this file /proc/net/rpc/flush (yeah, missing proc in updatedb.conf)

Version-Release number of selected component (if applicable):
2.7-22

Additional info:
Is there any good reason NOT to add the proc and sysfs filesystems to PRUNEFS in
the updatedb.conf shipped in the rpm?

Comment 1 Jiri Slaby 2005-07-29 15:53:14 UTC
Created attachment 117277 [details]
Patch you can apply to updatedb.conf, maybe by hand it may be faster

Comment 2 Miloslav Trmač 2005-08-01 13:36:17 UTC
Thanks for your report.

proc, sysfs and similar filesystems are excluded in /etc/cron.daily/slocate.cron;
that code will handle any new nodev filesystems used by the kernel, so
adding proc to /etc/updatedb.conf is just redundant.

Rather than avoiding the problem, the crash should be fixed; I'll try to find
the cause.
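
As a rough illustration of the exclusion described above: the nodev filesystem
types can be read from /proc/filesystems, where each pseudo-filesystem line
starts with the word nodev. The sketch below is not the code in slocate.cron.
The function names nodev_filesystems and prunefs_line are hypothetical, and the
sample text only imitates the /proc/filesystems layout.

```python
def nodev_filesystems(proc_filesystems_text):
    """Return filesystem types marked 'nodev' in /proc/filesystems-style text."""
    names = []
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        # Lines look like "nodev<TAB>proc" for pseudo filesystems and
        # "<TAB>ext3" for filesystems backed by a block device.
        if len(fields) == 2 and fields[0] == "nodev":
            names.append(fields[1])
    return names


def prunefs_line(names):
    """Format names as a PRUNEFS-style assignment (hypothetical helper)."""
    return 'PRUNEFS="' + " ".join(names) + '"'


# Sample input imitating /proc/filesystems; real contents vary by kernel.
SAMPLE = "nodev\tsysfs\nnodev\tproc\n\text3\nnodev\tdevpts\n"

if __name__ == "__main__":
    print(prunefs_line(nodev_filesystems(SAMPLE)))
```

Because the list is derived at run time, a filesystem type added by a newer
kernel would be picked up without editing /etc/updatedb.conf, which is the
point made in this comment.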

Comment 3 Jiri Slaby 2005-08-01 16:10:58 UTC
(In reply to comment #2)
> proc, sysfs and similar filesystems are excluded in /etc/cron.daily/slocate.cron;
> that code will handle any new nodev filesystems used by the kernel, so
> adding proc to /etc/updatedb.conf is just redundant.
The results of running updatedb by hand and from cron should be the same. Maybe
it would be better to ship updatedb as a script, instead of a symlink to slocate,
so that it checks /proc/filesystems as well.

Just a thought...
> 
> Rather than avoiding the problem, the crash should be fixed; I'll try to find
> the cause.
The problem is here
main.c:1093:       fts_close(dir);
but I don't have time to debug it now (a glibc bug, perhaps?).


Comment 4 Miloslav Trmač 2005-08-03 23:54:20 UTC
I can't reproduce the problem so far.
- What does "at this file /proc/net/rpc/flush" mean?
- Does the crash happen every time?
- Does (rpm -V slocate) report anything?
- If the crash is reproducible, can you please install the debuginfo package
  and run (valgrind --tool=memcheck --num-callers=20 updatedb) ?.

Comment 5 Jiri Slaby 2005-08-04 00:24:26 UTC
(In reply to comment #4)
> I can't reproduce the problem so far.
> - What does "at this file /proc/net/rpc/flush" mean?
I don't know now, it isn't there.

> - Does the crash happen every time?
Yeah, it did (on the same file ?or dir?).

> - Does (rpm -V slocate) report anything?
No, rpm was OK. I also tried a slocate I compiled myself and ran it from the
current directory (.), without results.

> - If the crash is reproducible, can you please install the debuginfo package
>   and run (valgrind --tool=memcheck --num-callers=20 updatedb) ?.
No, it isn't reproducible now.

Sorry for wasting your time, maybe it could be a kernel bug:
find: WARNING: Hard link count is wrong for /proc: this may be a bug in your
filesystem driver.  Automatically turning on find's -noleaf option.  Earlier
results may have failed to include directories that should have been searched.

I'll dig into the problem and let you know.

Comment 6 Miloslav Trmač 2005-08-04 00:34:13 UTC
Thanks for your quick reply.

(In reply to comment #5)
> > - What does "at this file /proc/net/rpc/flush" mean?
> I don't know now, it isn't there.
That's why I am asking :)

> > - Does the crash happen every time?
> Yeah, it did (on the same file ?or dir?).
How do you know it happens on the same file?

> Sorry for wasting your time, maybe it could be a kernel bug:
> find: WARNING: Hard link count is wrong for /proc: this may be a bug in your
> filesystem driver.  Automatically turning on find's -noleaf option.  Earlier
> results may have failed to include directories that should have been searched.
This could be related, but slocate still should not crash.  Nevertheless,
slocate runs fine here even though find reports the same warning.

> I'll dig into the problem and let you know.
Thanks, please reset the bug to the ASSIGNED state when you discover anything.

Comment 7 Jiri Slaby 2005-08-04 00:58:06 UTC
(In reply to comment #6)
> > > - Does the crash happen every time?
> > Yeah, it did (on the same file ?or dir?).
> How do you know it happens on the same file?
updatedb -v
started 5 times (and it stopped on the same file or dir or what was it)

> > Sorry for wasting your time, maybe it could be a kernel bug:
> > find: WARNING: Hard link count is wrong for /proc: this may be a bug in your
> > filesystem driver.  Automatically turning on find's -noleaf option.  Earlier
> > results may have failed to include directories that should have been searched.
> This could be related, but slocate still should not crash.  Nevertheless,
> slocate runs fine here even though find reports the same warning.
Maybe there could be some "bigger" underrun if the file was there (nfs creates
it in some cases: drivers/net/sunrpc/cache.c, but I don't know when).
Now the warning is here, but updatedb runs ok, as you write.

> > I'll dig into the problem and let you know.
> Thanks, please reset the bug to the ASSIGNED state when you discover anything.
It was a mistake, i didn't want to change the state :(.
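
The repeated-run check described in this comment could be scripted roughly as
follows. This is only a sketch: the helper names last_output_lines and
stops_at_same_place are hypothetical, and it assumes the verbose listing goes to
standard output, which this report does not confirm.

```python
import subprocess
import sys


def last_output_lines(command, runs=5):
    """Run command `runs` times and return the last stdout line of each run."""
    last_lines = []
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
        )
        lines = result.stdout.splitlines()
        # An empty string marks a run that printed nothing at all.
        last_lines.append(lines[-1] if lines else "")
    return last_lines


def stops_at_same_place(last_lines):
    """True when every run ended on the same non-empty line."""
    return bool(last_lines) and last_lines[0] != "" and len(set(last_lines)) == 1


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; to repeat the check
    # from this comment, pass ["updatedb", "-v"] instead.
    demo = [sys.executable, "-c", "print('/etc'); print('/proc/net')"]
    print(stops_at_same_place(last_output_lines(demo)))
```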


Comment 8 Miloslav Trmač 2005-08-04 18:04:24 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > > > - Does the crash happen every time?
> > > Yeah, it did (on the same file ?or dir?).
> > How do you know it happens on the same file?
> updatedb -v
> started 5 times (and it stopped on the same file or dir or what was it)
Thanks, that was missing from the original description.

> Maybe there could be some "bigger" underrun if the file was there (nfs creates
> it in some cases: drivers/net/sunrpc/cache.c, but I don't know when).
> Now the warning is here, but updatedb runs ok, as you write.
linux/net/sunrpc/cache.c:cache_register() can AFAICS only create
/proc/net/rpc/$something/flush .

> > > I'll dig into the problem and let you know.
> > Thanks, please reset the bug to the ASSIGNED state when you discover anything.
> It was a mistake, i didn't want to change the state :(.
It probably wasn't your mistake then; bugzilla might be doing it automatically.


One more question: at the time updatedb was crashing, were there any
very deep directory hierarchies (absolute path names longer than roughly 4350
bytes) on the filesystem?
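
One way to answer the question above is to walk the tree and record the longest
path name in bytes. The sketch below is an illustration only: longest_path is a
hypothetical helper, and the 4350-byte threshold is simply the figure quoted in
the question.

```python
import os
import tempfile

THRESHOLD_BYTES = 4350  # figure quoted in the question above


def longest_path(root):
    """Return ((length_in_bytes, path), unreadable_errors) for root."""
    root = os.path.abspath(root)
    best = (len(os.fsencode(root)), root)
    unreadable = []
    # os.walk does not follow symlinked directories by default; directories
    # it cannot list are reported through onerror instead of raising.
    for dirpath, dirnames, filenames in os.walk(root, onerror=unreadable.append):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            size = len(os.fsencode(path))
            if size > best[0]:
                best = (size, path)
    return best, unreadable


if __name__ == "__main__":
    # Demonstrate on a small temporary tree rather than the real filesystem.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a", "bb", "ccc"))
        (size, path), errors = longest_path(tmp)
        print(size, path, size > THRESHOLD_BYTES, len(errors))
```

A directory whose full name is too long for the operating system to open would
appear in the error list rather than be descended into, so a non-empty error
list is itself worth inspecting.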


Comment 9 Jiri Slaby 2005-08-04 18:51:06 UTC
(In reply to comment #8)
> linux/net/sunrpc/cache.c:cache_register() can AFAICS only create
> /proc/net/rpc/$something/flush .
It is possible, but I use recent -mm kernels for testing, so there could be a bug.
I don't think I have used NFS, so I really don't know what the flush file was :(.

> One more question: at the time updatedb was crashing, were there any
> very deep directory hierarchies (absolute path names longer than roughly 4350
> bytes) on the filesystem?
I don't think so. I didn't make any big changes, and now I have only one file
whose absolute path is approx. 250 characters long.

Comment 10 Miloslav Trmač 2005-08-23 12:55:38 UTC
The new slocate-2.7-22.fc4.1 package fixes several bugs that could cause similar
crashes.

If you find a way to reproduce the original problem (with either -22 or
-22.fc4.1), please reopen this bug.