Description of problem:
file(1) very often segfaults.

Version-Release number of selected component (if applicable):
4.10-1

How reproducible:
Easy to reproduce, but not predictable.

Steps to Reproduce:
while file /usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp; do :; done

Actual results:
After about 100 runs or so:
[...]
/usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp: data
Segmentation fault (core dumped)

Expected results:
No crashes.

Additional info:
I've verified that this happens on two separate machines.
I've run your test loop, and also wrote my own version with a counter. It runs 1000 times without any segfault. May I have your kernel version? Can you also check the maximum number of open files on your system?
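For anyone else trying to reproduce this, a loop with a counter could look roughly like the sketch below. It is not the script actually used above; the function name runs_until_failure and the run limit are made up for illustration. It relies only on the fact that a child killed by SIGSEGV is reported to the shell as a non-zero exit status (normally 139, i.e. 128 + 11).

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): run a command up to
# MAX times and report how many runs succeeded before the first failure.
runs_until_failure() {
    max=$1
    shift
    n=0
    while [ "$n" -lt "$max" ]; do
        # A crash in the child shows up here as a non-zero exit status.
        "$@" >/dev/null 2>&1 || {
            echo "failed after $n successful runs (status $?)"
            return 1
        }
        n=$((n + 1))
    done
    echo "no failure in $max runs"
}

# Example invocation, using the file from the report:
#   runs_until_failure 1000 file /usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp
```

A reported status of 139 would point at a segfault; any other non-zero status means the command failed for a different reason.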
$ uname -r
2.6.8-1.535
$ ulimit -n
1024

I've had confirmation on #fedora-devel that others can reproduce it.
Are you running on x86? Perhaps it's an arch-specific bug. I am running i686. It isn't anything specific to that particular file; it happens with all sorts of 'data' files. I see file(1) fail about 4 or 5 times at the end of a tetex package build. Here I've seen it fail with both 2.6.7-1.503 and 2.6.8-1.535, so I don't think it's a kernel issue. I'll get you a backtrace..
(gdb) bt
#0  0x00a7eef5 in file_softmagic (ms=0xa010830, buf=0xfeeca6c0 "", nbytes=392) at softmagic.c:70
#1  0x00a82c19 in file_buffer (ms=0xa010830, buf=0xfeeca6c0, nb=392) at funcs.c:123
#2  0x00a7b332 in magic_file (ms=0xa010830, inname=0xfefa05c7 "/usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp") at magic.c:281
#3  0x08048d0c in process (inname=0xfefa05c7 "/usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp", wid=0) at file.c:388
#4  0x080491b8 in main (argc=2, argv=0xfeedb874) at file.c:315
#5  0x00967a93 in __libc_start_main () from /lib/tls/libc.so.6
#6  0x08048b11 in _start ()
(gdb) info locals
ml = (struct mlist *) 0x110c8
(gdb) p *ml
Cannot access memory at address 0x110c8
Still fails when built with -O0, and having MALLOC_CHECK_=1 set does not give any indication of memory problems. It doesn't seem to fail when linked against efence. valgrind didn't show any problems. Compiling and running with mudflap (gcc35 -fmudflap -lmudflap) didn't find any problems either. Good luck finding this one. :-)
Clarification: the only time that file(1) did *not* fail was when linked against efence. Maybe that's useful information for tracking it down.
$ while file /usr/share/texmf/omega/ocp/char2uni/inkoi8.ocp; do :; done|wc -l
83

segfaults... SELinux is turned on in targeted mode.

[harald@jever ~]$ uname -r
2.6.8-1.541smp
[harald@jever ~]$ rpm -q glibc file
glibc-2.3.3-47
file-4.10-1
$ uname -m
i686
I also have glibc-2.3.3-47 (i686).
Try this: echo 1 > /proc/sys/kernel/exec-shield-randomize
After doing this:

echo 1 > /proc/sys/kernel/print-fatal-signals

and re-running file until segfault, I get this from dmesg:

file/30580: potentially unexpected fatal signal 11.
code at 008f0ef5: 8b 52 0c 39 11 89 55 c4 0f 85 c9 fe ff ff 83 c4
Pid: 30580, comm: file
EIP: 0073:[<008f0ef5>] CPU: 0
EIP is at 0x8f0ef5
ESP: 007b:fef95a40 EFLAGS: 00210246 Not tainted (2.6.8-1.535)
EAX: 00000000 EBX: 008f72d8 ECX: 0a02e830 EDX: 0002f0c8
ESI: 00000001 EDI: f6de9500 EBP: fef95ac8 DS: 007b ES: 007b
CR0: 8005003b CR2: 0002f0d4 CR3: 003c0000 CR4: 000006d0
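A note on reading this dump: the first bytes at the faulting EIP, 8b 52 0c, decode on i386 as mov 0xc(%edx),%edx, a load from EDX plus 12. The small check below is only a sketch of that arithmetic, using the register values from the dmesg output. It shows that EDX + 0x0c equals the faulting address in CR2, which is consistent with a load through a bogus low pointer like the ml value in the gdb session. That link to ml is an inference, not something the dump states.

```shell
# Values copied from the dmesg output above.
edx=0x0002f0c8   # EDX at the time of the fault
disp=0x0c        # displacement byte in "8b 52 0c"
cr2=0x0002f0d4   # CR2: the address that faulted

# Effective address of the load: base register plus displacement.
fault=$(( $edx + $disp ))
printf 'EDX + disp = %08x, CR2 = %08x\n' "$fault" "$(( $cr2 ))"
```

Both values print as 0002f0d4, so the fault is the load itself reading from an unmapped address just above the pointer held in EDX.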
I can't seem to get it to fail under valgrind either. Presumably valgrind and ElectricFence both bypass the exec-shield randomized base address for the programs they examine.
Found and fixed this bug. The fix is in 4.10-2.
Patch sent upstream.
*** Bug 140720 has been marked as a duplicate of this bug. ***