Description of problem:
logrotate aborts with:

error: stat of /var/log/boot.log failed: No such file or directory
error: stat of /var/log/cron failed: No such file or directory
*** glibc detected *** corrupted double-linked list: 0x0000003d5ed2f698 ***
Aborted

Version-Release number of selected component (if applicable):
3.7.2-7

How reproducible:
Always.

Steps to Reproduce:
1. logrotate -f /etc/logrotate.conf

Actual results:
Program aborts.

Expected results:
Should run?

Additional info:
This is the logrotate SRPM compiled on x86_64 in RHEL U2, with rpmbuild --rebuild etc.

GDB output:
-----------------------------
(gdb) r
Starting program: /usr/sbin/logrotate -f /etc/logrotate.conf
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
error: stat of /var/log/boot.log failed: No such file or directory
error: stat of /var/log/cron failed: No such file or directory
Detaching after fork from child process 9433.
*** glibc detected *** corrupted double-linked list: 0x0000003d5ed2f698 ***

Program received signal SIGABRT, Aborted.
0x0000003d5eb2e37d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0 0x0000003d5eb2e37d in raise () from /lib64/tls/libc.so.6
#1 0x0000003d5eb2faae in abort () from /lib64/tls/libc.so.6
#2 0x0000003d5eb62be1 in __libc_message () from /lib64/tls/libc.so.6
#3 0x0000003d5eb67ee8 in malloc_consolidate () from /lib64/tls/libc.so.6
#4 0x0000003d5eb683fa in _int_free () from /lib64/tls/libc.so.6
#5 0x0000003d5eb689b6 in free () from /lib64/tls/libc.so.6
#6 0x0000000000404ea3 in main ()
-----------------------------
Created attachment 120110 [details] strace output
Created attachment 120111 [details] Configuration file
One more thing, so you don't spend too much time looking for a cause... Once the option "extension old" was put in, the program started crashing. With dateext, it worked fine.
Could you install the debuginfo rpm as well? That makes the backtrace a lot more verbose/useful.
Seems to be caused by a patch submitted in bug 169888 (Comment #2, attachment id=119955) to free memory.

From what I understand from the source code, the problem is that logInfo structs may share data, as in the example below, but the code frees all data for each logInfo struct individually, which results in freeing the same memory region two or more times.

I have set itemPtr->extension = NULL after freeing it, for safety, but Valgrind, at least, still reports errors, so there is something strange here. I cannot take a look now because I am at work.

# logrotate runs:
defConfig->extension = NULL

# let's find the first log config
i = 0
# config file parsing, first step is to set default values...
newlog = defConfig
# suppose no extension parameter is found.
# function returns values for the first log config.
(*logsPtr + i) = newlog
# result for the first log config:
(*logsPtr + i)->extension == NULL

# let's find the second log config
i = 1
# still "defConfig->extension == NULL"
# looking in config again, preset with values of defConfig...
newlog = defConfig
# now suppose the "extension old" parameter is found
newlog->extension = strdup("old");
# assign new config to new log pointer:
(*logsPtr + i) = newlog
# now
(*logsPtr + i)->extension == "old"
# and because of "newlog = defConfig" above...
defConfig->extension == "old"

# now the third log config
i = 2
# remember, defConfig->extension == "old"
# looking in config again, use values of defConfig by default...
newlog = defConfig
# like the 1st log config, suppose no extension parameter is found.
# function returns.
(*logsPtr + i) = newlog
# result for the third log file:
(*logsPtr + i)->extension == "old"
# this is because (*logsPtr + 2)->extension now points to the same address as defConfig->extension and (*logsPtr + 1)->extension

# then, when exiting...
free((*logsPtr + 0)->extension); /* ok, because == free(NULL); */
free((*logsPtr + 1)->extension); /* ok, because == free("old"); */
free((*logsPtr + 2)->extension); /* error, because (*logsPtr + 1)->extension was already freed and this points to the same address */
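The double free described above comes from several entries holding the same pointer. One way to avoid it, sketched below, is to give every entry its own copy of each string when the defaults are copied in, so that each entry can be freed independently. This is only an illustration of the ownership idea, not what the attached patches do: struct log_info, copy_defaults(), set_extension() and free_log() are hypothetical names, and the struct is a cut-down stand-in for logrotate's logInfo.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for logrotate's logInfo struct. */
struct log_info {
    char *extension; /* heap-allocated string, or NULL */
};

/* Copy the defaults into a new entry, duplicating the string so that the
 * entry owns its own allocation. Returns 0 on success, -1 on failure. */
static int copy_defaults(struct log_info *dst, const struct log_info *defaults)
{
    assert(dst != NULL && defaults != NULL);
    dst->extension = NULL;
    if (defaults->extension != NULL) {
        dst->extension = strdup(defaults->extension);
        if (dst->extension == NULL)
            return -1;
    }
    return 0;
}

/* Replace an entry's extension, releasing the previous value first. */
static int set_extension(struct log_info *log, const char *ext)
{
    char *copy;

    assert(log != NULL && ext != NULL);
    copy = strdup(ext);
    if (copy == NULL)
        return -1;
    free(log->extension);
    log->extension = copy;
    return 0;
}

/* Safe to call once per entry, because no two entries share a pointer. */
static void free_log(struct log_info *log)
{
    assert(log != NULL);
    free(log->extension);
    log->extension = NULL;
}
```

With this layout, a later "extension old" only changes the entry it is applied to, and freeing every entry at exit never touches the same allocation twice. The cost is one extra strdup() per inherited string.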
Sorry, I will recompile and try again today with the -8.
Version -8 has the same problem. Just give me some hours, until I'm back at home, and I will provide a patch to fix this problem.
Here is that backtrace with debuginfo:
-------------------------------------
galileo {root}# gdb /usr/sbin/logrotate
GNU gdb Red Hat Linux (6.3.0.0-1.63rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".

(gdb) set args -f /etc/logrotate.conf
(gdb) r
Starting program: /usr/sbin/logrotate -f /etc/logrotate.conf
error: stat of /var/log/boot.log failed: No such file or directory
error: stat of /var/log/cron failed: No such file or directory
Detaching after fork from child process 12330.
*** glibc detected *** corrupted double-linked list: 0x0000003d5ed2f698 ***

Program received signal SIGABRT, Aborted.
0x0000003d5eb2e37d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0 0x0000003d5eb2e37d in raise () from /lib64/tls/libc.so.6
#1 0x0000003d5eb2faae in abort () from /lib64/tls/libc.so.6
#2 0x0000003d5eb62be1 in __libc_message () from /lib64/tls/libc.so.6
#3 0x0000003d5eb67ee8 in malloc_consolidate () from /lib64/tls/libc.so.6
#4 0x0000003d5eb683fa in _int_free () from /lib64/tls/libc.so.6
#5 0x0000003d5eb689b6 in free () from /lib64/tls/libc.so.6
#6 0x0000000000404e93 in main (argc=Variable "argc" is not available.
) at logrotate.c:97
#7 0x0000003d5eb1c4bb in __libc_start_main () from /lib64/tls/libc.so.6
#8 0x000000000040213a in _start ()
#9 0x0000007fbffff9e8 in ?? ()
#10 0x000000000000001c in ?? ()
#11 0x0000000000000003 in ?? ()
#12 0x0000007fbffffbea in ?? ()
#13 0x0000007fbffffbfe in ?? ()
#14 0x0000007fbffffc01 in ?? ()
#15 0x0000000000000000 in ?? ()
-------------------------------------
BTW, above BT was for -8, which obviously still has a problem, as Mateus already pointed out.
I'm being rude... Thanks Mateus for working on this problem.
Created attachment 120179 [details] First attempt to fix segfault of free_single_loginfo_item()

I am sending a patch to fix this problem. Basically, it just checks whether the address of the string was already freed before freeing the current one. I am not satisfied with it, because there are repetitive blocks of code, but I am failing to get a working helper function for this case. Could someone help me improve this patch?
Given that the only changing bit is the lexical token after itemPtr (i.e. itemPtr->pattern, itemPtr->oldDir, etc.), maybe you can have a macro instead. Something like:

#define freeItem(what) \
    if (itemPtr->what) { \
        for (j = 0; j < i; j++) \
            if (itemPtr->what == (*logsPtr + j)->what) \
                break; \
        if (j == i) \
            free(itemPtr->what); \
    }

And then just have:

freeItem(pattern);
freeItem(oldDir);

And so on... I didn't actually test this. Just a suggestion...
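For reference, here is a self-contained sketch of the same macro idea. It is a variation on the suggestion above, not the code from the attached patches: instead of skipping a pointer that an earlier entry already holds, it frees a pointer only when no later entry holds it, and then clears the field. That way the loop never compares against a pointer that has already been freed. The names struct log_info, FREE_IF_LAST_OWNER and free_logs are hypothetical, and the struct is a cut-down stand-in for logInfo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for logrotate's logInfo struct. */
struct log_info {
    char *pattern;
    char *old_dir;
    char *extension;
};

/* Free logs[i].field only if no later entry holds the same pointer, then
 * clear it. The last entry that references an allocation releases it, so
 * each allocation is freed exactly once. */
#define FREE_IF_LAST_OWNER(logs, count, i, field)                \
    do {                                                         \
        size_t j_;                                               \
        for (j_ = (i) + 1; j_ < (count); j_++)                   \
            if ((logs)[j_].field == (logs)[i].field)             \
                break;                                           \
        if (j_ == (count))                                       \
            free((logs)[i].field);                               \
        (logs)[i].field = NULL;                                  \
    } while (0)

/* Release the string fields of every entry, tolerating shared pointers. */
static void free_logs(struct log_info *logs, size_t count)
{
    size_t i;

    assert(logs != NULL || count == 0);
    for (i = 0; i < count; i++) {
        FREE_IF_LAST_OWNER(logs, count, i, pattern);
        FREE_IF_LAST_OWNER(logs, count, i, old_dir);
        FREE_IF_LAST_OWNER(logs, count, i, extension);
    }
}
```

The do { ... } while (0) wrapper makes the macro behave like a single statement, so it stays safe inside an unbraced if/else. Note that this sketch only covers pointers shared between entries of the array; a pointer also held by a separate defaults struct would need the same care.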
Created attachment 120188 [details] New version of the patch, using macro. Thanks for your suggestion, Bojan. I was thinking about a function, but a macro helps a lot.
Thx. Mateus. Guys, check if http://people.redhat.com/pvrabec/rpms/logrotate-3.7.2-9.src.rpm works, please. I can't reproduce the problem right now.
It doesn't segfault any more, so that's good news. However, if I have:

extension old

in my config file, it doesn't actually create the rotated files with that extension, but rather just creates them as numbers. I was under the impression that I should get:

<filename>.old.1

or

<filename>.old.1.gz

if compression is enabled. Or maybe I misread this option?

The dateext option works fine though, with and without compression.
One more thing... The option gets picked up by config.c (here is that part of the debug output):

extension is now old

However, it doesn't get used later on...
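If you use "extension .old", then the file "somefile.old" is rotated to "somefile.1.old". The option does not add an extension to the rotated file; it keeps an extension the log file already has at the end of the rotated name.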
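The naming rule described in this comment can be sketched as follows. This is only an illustration of the behaviour, not logrotate's actual code: rotated_name() and its parameters are hypothetical. If the log file name ends with the configured extension, the rotation number is inserted before that extension; otherwise the number is simply appended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build the rotated file name into out.
 * If name ends with ext, the number goes before ext ("a.old" -> "a.1.old").
 * Otherwise the number is appended ("a" -> "a.1").
 * Returns 0 on success, -1 if the result does not fit in out. */
static int rotated_name(char *out, size_t out_size, const char *name,
                        const char *ext, int number)
{
    size_t name_len, ext_len;
    int written;

    assert(out != NULL && out_size > 0 && name != NULL);
    name_len = strlen(name);
    ext_len = (ext != NULL) ? strlen(ext) : 0;

    if (ext_len > 0 && name_len >= ext_len &&
        strcmp(name + name_len - ext_len, ext) == 0) {
        /* Print the base name without its extension, then number, then ext. */
        written = snprintf(out, out_size, "%.*s.%d%s",
                           (int)(name_len - ext_len), name, number, ext);
    } else {
        written = snprintf(out, out_size, "%s.%d", name, number);
    }
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

Under this rule, "extension old" (without the dot) has no visible effect on a file such as /var/log/messages, because the name does not end in "old", which matches the numbered-only files reported above.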