+++ This bug was initially created as a clone of Bug #193808 +++

-- Additional comment from xxx on 2006-06-01 14:51 EST --

OK, this reproduces easily, it is obviously a bug, and you don't need any fancy setup. Pick two root processes. Then as a normal user do something like:

[ben@quince tmp]$ strace -ff -o output -p 2715 -p 2738
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
*** glibc detected *** double free or corruption (top): 0x0951a7a8 ***
Aborted

The problem is tied to the -ff option. If you don't have the -ff option it doesn't happen. Should be an easy fix. I think this has got to be one of the cases where no one has ever tried to use double -p options with -ff before.

-- Additional comment from xxx on 2006-06-01 14:51 EST --

[ben@quince tmp]$ gdb strace
GNU gdb Red Hat Linux (6.3.0.0-1.96rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) set args -ff -o output -p 2715 -p 2738
(gdb) run
Starting program: /usr/bin/strace -ff -o output -p 2715 -p 2738
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
*** glibc detected *** double free or corruption (top): 0x081c67a8 ***

Program received signal SIGABRT, Aborted.
0x00aa47a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x00aa47a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00ae47f5 in raise () from /lib/tls/libc.so.6
#2  0x00ae6199 in abort () from /lib/tls/libc.so.6
#3  0x00b184ea in __libc_message () from /lib/tls/libc.so.6
#4  0x00b1ec6f in _int_free () from /lib/tls/libc.so.6
#5  0x00b1efea in free () from /lib/tls/libc.so.6
#6  0x00b0f516 in fclose@@GLIBC_2.1 () from /lib/tls/libc.so.6
#7  0x08049880 in droptcb (tcp=0x81c611c) at strace.c:1163
#8  0x0804b26e in main (argc=8, argv=0xbfe9cf24) at strace.c:473
#9  0x00ad1e23 in __libc_start_main () from /lib/tls/libc.so.6
#10 0x080494c1 in _start ()
(gdb)

-- Additional comment from xxx on 2006-06-01 14:51 EST --

The problem appears to be this section of code.

368     else if ((outf = fopen(outfname, "w")) == NULL) {
369             fprintf(stderr, "%s: can't fopen '%s': %s\n",
370                     progname, outfname, strerror(errno));
371             exit(1);
372     }
373

It needs some logic to handle the multiple -f options, the way the code that hands the output off to a pipe does:

354     if (followfork > 1) {
355             fprintf(stderr, "\
356 %s: piping the output and -ff are mutually exclusive options\n",
357                     progname);
358             exit(1);
359     }

-- Additional comment from xxx on 2006-06-01 14:51 EST --

That is not to say that it is sufficient to just bomb out like the pipe option does. It means that the part of the code that opens up the file handles and sticks the outf into the tcp structure needs to be iterated through.

-- Additional comment from xxx on 2006-06-01 14:51 EST --

File uploaded: strace-ff.patch

-- Additional comment from xxx on 2006-06-01 14:51 EST --

Basically, the temptation to just implement a patch which says "don't do that" seemed so great that I felt that by writing the patch myself, I could reduce the likelihood that that would happen. Thus here is the patch. It probably could use some additional testing but it does seem to work on my system. Can we send this up to engineering now? I think that there is a high probability that upstream will need pretty much the same patch.

-- Additional comment from woodard on 2006-06-01 15:05 EST --

Created an attachment (id=130359)
fixes the problem for me.

This fixes the problem for me. It probably could use a bit more testing than I gave it though.
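The backtrace above shows the abort coming from fclose() called inside droptcb(). One way to end up there is for several per-process records to hold the same FILE pointer and for each of them to close it on teardown. The following is only a minimal sketch of that hazard and of one possible guard. The struct tracee type and the drop_tracee() helper are hypothetical names made up for illustration; this is neither strace's actual code nor the attached patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-process control block. */
struct tracee {
    int pid;
    FILE *outf;   /* stream this tracee's output is written to */
};

/*
 * Release a tracee's output stream.
 *
 * `shared` is the single stream opened for -o when per-process files
 * are not in use.  It belongs to the caller and must be closed exactly
 * once, so it is never closed here.  Returns 1 if a stream was closed,
 * 0 otherwise.
 */
static int drop_tracee(struct tracee *t, FILE *shared)
{
    int closed = 0;

    assert(t != NULL);
    if (t->outf != NULL && t->outf != shared) {
        fclose(t->outf);
        closed = 1;
    }
    t->outf = NULL;   /* dropping the same tracee again is now harmless */
    return closed;
}
```

With this guard, two tracees that were both handed the shared stream can be dropped in any order without the stream being closed twice. The alternative the comments above argue for is to give every tracee its own stream, so that an unconditional fclose() per tracee is correct.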
Created attachment 142824
strace-ff.patch

The same patch once again.
*** Bug 218461 has been marked as a duplicate of this bug. ***
Indeed, strace -ff -o pathname -p pid1 -p pid2 is broken: it crashes on a double fclose no matter what the pid1 and pid2 values are. The strace-ff.patch is not complete: it always creates filename.0 and does not cover the case where pid refers to a process with threads. I'll post another patch for review.
Created attachment 142909
strace-ff-o.patch

This patch changes strace -ff -o behaviour according to strace manpage.

With this patch applied:

$ rm -f log* ; strace -ff -o log sleep 1; ls -log log*
-rw-r--r-- 1 1739 Dec 6 03:37 log.822

$ rm -f log* ; sleep 1& pid=$! ; strace -ff -o log -p $pid ; ls -log log*
[1] 825
Process 825 attached - interrupt to quit
Process 825 detached
[1]+ Done sleep 1
-rw-r--r-- 1 145 Dec 6 03:37 log.825

$ rm -f log* ; sleep 1& pid1=$! ; sleep 1& pid2=$! ; strace -ff -o log -p $pid1 -p $pid2 ; ls -log log*
[1] 829
[2] 830
Process 829 attached - interrupt to quit
Process 830 attached - interrupt to quit
Process 829 detached
Process 830 detached
[1]- Done sleep 1
[2]+ Done sleep 1
-rw-r--r-- 1 151 Dec 6 03:37 log.829
-rw-r--r-- 1 126 Dec 6 03:37 log.830

$ rm -f log* ; sh -c 'sleep 1& sleep 1'& pid1=$! ; sh -c 'sleep 1& sleep 1'& pid2=$! ; strace -ff -o log -p $pid1 -p $pid2 ; ls -log log*
[1] 1234
[2] 1235
Process 1234 attached - interrupt to quit
Process 1235 attached - interrupt to quit
Process 1238 attached
Process 1234 suspended
Process 1239 attached
Process 1240 attached
Process 1235 suspended
Process 1234 resumed
Process 1238 detached
Process 1234 detached
[1]- Done sh -c 'sleep 1& sleep 1'
Process 1235 resumed
Process 1239 detached
Process 1240 detached
Process 1235 detached
[2]+ Done sh -c 'sleep 1& sleep 1'
-rw-r--r-- 1 1954 Dec 6 04:33 log.1234
-rw-r--r-- 1 6107 Dec 6 04:33 log.1235
-rw-r--r-- 1 2205 Dec 6 04:33 log.1238
-rw-r--r-- 1 2824 Dec 6 04:33 log.1239
-rw-r--r-- 1 2168 Dec 6 04:33 log.1240

Without this patch applied:

$ rm -f log* ; strace -ff -o log sleep 1; ls -log log*
-rw-r--r-- 1 1741 Dec 6 04:20 log

$ rm -f log* ; sleep 1& pid=$! ; strace -ff -o log -p $pid ; ls -log log*
[1] 1121
Process 1121 attached - interrupt to quit
Process 1121 detached
[1]+ Done sleep 1
-rw-r--r-- 1 145 Dec 6 04:20 log

$ rm -f log* ; sleep 1& pid1=$! ; sleep 1& pid2=$! ; strace -ff -o log -p $pid1 -p $pid2 ; ls -log log*
[1] 1125
[2] 1126
Process 1125 attached - interrupt to quit
Process 1126 attached - interrupt to quit
Process 1125 detached
Process 1126 detached
*** glibc detected *** strace: double free or corruption (top): 0x000000000064de10 ***
[1]- Done sleep 1
[2]+ Done sleep 1
[...]
Aborted
-rw-r--r-- 1 233 Dec 6 04:20 log
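In the patched runs above, each traced process gets its own output file named after the -o argument plus the process ID (log.829, log.830, and so on). As a rough illustration of that naming scheme only, here is a small sketch. The helpers per_pid_name() and open_per_pid_log() are hypothetical names invented for this example and are not taken from strace-ff-o.patch or from strace itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build "<base>.<pid>" into buf.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 * (Hypothetical helper for illustration.)
 */
static int per_pid_name(char *buf, size_t size, const char *base, int pid)
{
    int n;

    assert(buf != NULL && base != NULL);
    n = snprintf(buf, size, "%s.%d", base, pid);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Open a private output file for one traced process.
 * Returns NULL if the name is too long or fopen() fails.
 */
static FILE *open_per_pid_log(const char *base, int pid)
{
    char name[4096];

    if (per_pid_name(name, sizeof name, base, pid) != 0)
        return NULL;
    return fopen(name, "w");
}
```

Because every process then owns a distinct stream, closing that stream when the process is dropped cannot collide with another process's close, which is the property the unpatched -p pid1 -p pid2 case lacks.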
Fixed upstream.
These bugs are fixed upstream in the upcoming 4.5.15 release.
4.5.15 in rawhide and in updates for fc5 and fc6 fixes this.