131892 – file(1) segfaults

Summary: file(1) segfaults

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	file
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Radek Vokál
QA Contact:	Mike McLean
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	140720
Depends On:
Blocks:	FC3Target

Reported:	2004-09-06 12:37 UTC by Tim Waugh
Modified:	2007-11-30 22:10 UTC
CC List:	4 users
Fixed In Version:	4.10-2
Clone Of:
Environment:
Last Closed:	2004-10-12 11:50:49 UTC
Type:	---
Embargoed:
Dependent Products:

Description Tim Waugh 2004-09-06 12:37:12 UTC

Description of problem:
file(1) very often segfaults.

Version-Release number of selected component (if applicable):
4.10-1

How reproducible:
Easy to reproduce, but not predictable.

Steps to Reproduce:
while file /usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp; do :; done
  
Actual results:
After about 100 runs or so:

[...]
/usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp: data
Segmentation fault (core dumped)

Expected results:
No crashes.

Additional info:
I've verified that this happens on two separate machines.

Comment 1 Radek Vokál 2004-09-07 06:57:53 UTC

I've run your test code and also wrote my own version with a counter.
It runs 1000 times without any segfault. May I have your kernel
version? Can you also check the maximum number of open files on your
system?
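
A counting variant of the reproducer could look like the sketch below. The helper name `run_until_fail` and its output wording are illustrative assumptions, not anything from file(1) or this report. It reruns a command until it exits non-zero (a segfault typically shows up in the shell as status 139, i.e. 128 + SIGSEGV) or until a cap is reached.

```shell
#!/bin/sh
# run_until_fail MAX CMD [ARGS...]
# Hypothetical helper: run CMD up to MAX times and report the first
# run on which it exits with a non-zero status.
run_until_fail() {
    max=$1
    shift
    n=0
    while [ "$n" -lt "$max" ]; do
        n=$((n + 1))
        status=0
        # Discard the command's own output; only its exit status matters here.
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -ne 0 ]; then
            echo "failed on run $n with exit status $status"
            return 1
        fi
    done
    echo "no failure in $max runs"
    return 0
}
```

For this bug it would be invoked as `run_until_fail 1000 file /usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp`.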

Comment 2 Tim Waugh 2004-09-07 08:38:31 UTC

$ uname -r
2.6.8-1.535
$ ulimit -n
1024

I've had confirmation on #fedora-devel that others can reproduce it.

Comment 4 Tim Waugh 2004-09-07 09:46:04 UTC

Are you running on x86?  Perhaps it's an arch-specific bug.  I am
running i686.

It isn't anything specific to that particular file; it happens with
all sorts of 'data' files.  I see file(1) fail about 4 or 5 times at
the end of a tetex package build.

Here I've seen it fail with both 2.6.7-1.503 and 2.6.8-1.535, so I
don't think it's a kernel issue.

I'll get you a backtrace.

Comment 5 Tim Waugh 2004-09-07 09:47:56 UTC

(gdb) bt
#0  0x00a7eef5 in file_softmagic (ms=0xa010830, buf=0xfeeca6c0 "", nbytes=392)
    at softmagic.c:70
#1  0x00a82c19 in file_buffer (ms=0xa010830, buf=0xfeeca6c0, nb=392)
    at funcs.c:123
#2  0x00a7b332 in magic_file (ms=0xa010830,
    inname=0xfefa05c7 "/usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp")
    at magic.c:281
#3  0x08048d0c in process (
    inname=0xfefa05c7 "/usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp", wid=0)
    at file.c:388
#4  0x080491b8 in main (argc=2, argv=0xfeedb874) at file.c:315
#5  0x00967a93 in __libc_start_main () from /lib/tls/libc.so.6
#6  0x08048b11 in _start ()
(gdb) info locals
ml = (struct mlist *) 0x110c8
(gdb) p *ml
Cannot access memory at address 0x110c8

Comment 6 Tim Waugh 2004-09-07 10:21:10 UTC

Still fails when built with -O0, and having MALLOC_CHECK_=1 set does
not give any indication of memory problems.

It doesn't seem to fail when linked against efence.

valgrind didn't show any problems.

Compiling and running with mudflap (gcc35 -fmudflap -lmudflap) didn't
find any problems.

Good luck finding this one. :-)

Comment 7 Tim Waugh 2004-09-07 10:21:57 UTC

Clarification: the only time that file(1) did *not* fail was when
linked against efence.  Maybe that's useful information for tracking
it down.

Comment 10 Harald Hoyer 2004-09-07 12:35:56 UTC

$ while file /usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp; do :; done | wc -l
83
segfaults...

SELinux turned on in targeted mode.
[harald@jever ~]$ uname -r
2.6.8-1.541smp
[harald@jever ~]$ rpm -q glibc file
glibc-2.3.3-47
file-4.10-1
$ uname -m
i686

Comment 11 Tim Waugh 2004-09-07 12:37:06 UTC

I also have glibc-2.3.3-47 (i686).

Comment 12 Tim Waugh 2004-09-07 12:39:43 UTC

Try this:

echo 1 > /proc/sys/kernel/exec-shield-randomize

Comment 13 Tim Waugh 2004-09-07 12:45:36 UTC

After doing this:

echo 1 > /proc/sys/kernel/print-fatal-signals

and re-running file until segfault, I get this from dmesg:

file/30580: potentially unexpected fatal signal 11.
code at 008f0ef5: 8b 52 0c 39 11 89 55 c4 0f 85 c9 fe ff ff 83 c4

Pid: 30580, comm:                 file
EIP: 0073:[<008f0ef5>] CPU: 0
EIP is at 0x8f0ef5
 ESP: 007b:fef95a40 EFLAGS: 00210246    Not tainted  (2.6.8-1.535)
EAX: 00000000 EBX: 008f72d8 ECX: 0a02e830 EDX: 0002f0c8
ESI: 00000001 EDI: f6de9500 EBP: fef95ac8 DS: 007b ES: 007b
CR0: 8005003b CR2: 0002f0d4 CR3: 003c0000 CR4: 000006d0
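
The register dump can be cross-checked with a little hex arithmetic. On 32-bit x86 the first code bytes, 8b 52 0c, decode to mov 0xc(%edx),%edx, so the faulting address in CR2 should be EDX + 0xc. The sketch below (helper names are made up for illustration) checks that. It also compares the page offset of this EIP with frame #0 of the earlier gdb backtrace, which is what one would expect if the same instruction is crashing at a different randomized load address.

```shell
#!/bin/sh
# Hypothetical helpers for checking the numbers in the dump above.

# Difference between two addresses, printed in hex.
addr_diff() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Offset of an address within its 4 KiB page.
page_offset() {
    printf '0x%x\n' $(( $1 & 0xfff ))
}

addr_diff 0x0002f0d4 0x0002f0c8   # CR2 - EDX
page_offset 0x008f0ef5            # EIP from dmesg
page_offset 0x00a7eef5            # frame #0 from the gdb backtrace
```

The first line prints 0xc, matching the displacement in the instruction, and both page offsets come out as 0xef5. That is consistent with a bad pointer being dereferenced at the same spot in file_softmagic in both crashes, though it says nothing yet about where the bad pointer comes from.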

Comment 14 Tim Waugh 2004-10-12 09:09:39 UTC

I can't seem to get it to fail under valgrind either.  Presumably
valgrind and ElectricFence both bypass the exec-shield randomized base
address for the programs they examine.

Comment 15 Tim Waugh 2004-10-12 11:50:49 UTC

Found and fixed this bug in 4.10-2.

Comment 16 Tim Waugh 2004-10-12 14:44:17 UTC

Patch sent upstream.

Comment 17 Radek Vokál 2004-11-24 14:33:05 UTC

*** Bug 140720 has been marked as a duplicate of this bug. ***
