A cron job that calls clock_gettime(), which glibc also does internally in some places, is being killed by the kernel auditing code. This is either a regression in U3, or cron only started enabling auditing in U3 and this is the first time I have seen it.
Linux emcee.cambridge.redhat.com 2.4.21-20.EL #1 Wed Aug 18 20:48:55 EDT 2004 x86_64 x86_64 x86_64 GNU/Linux
$ cat clock.c
#include <time.h>
int main(int argc, char **argv)
{
    struct timespec ts;
    return clock_gettime(CLOCK_REALTIME, &ts);
}
$ gcc -m32 -o clock clock.c -lrt
$ su
Password:
$ ./clock
$ echo $?
0
$ /usr/sbin/aurun ./clock
Killed
$ echo $?
137
$ dmesg | tail -1
audit_intercept: error 38, killing task
If run under strace and not under aurun:
clock_gettime(0, 0xfffe3f70) = -1 ENOSYS (Function not implemented)
entry.S compares against IA32_NR_syscalls before it enters the audit code. The x86_64 audit syscall code checks whether the syscall number in the regs is >= IA32_NR_syscalls; this needs to be changed to > to stop audit from killing the process.
I do not believe this to be a kernel regression.
I also hit this issue. In my case, it is a proprietary 32 bit application (IBM Tivoli Storage Manager) which is killed with the same error. The only fix is disabling the audit code, but in that case cron gives a lot of errors. Any timeline on a fix?
Any news on this issue? U4 is out and still does not fix this problem.
Reassigning to Brian.
I also hit this issue, with update 5. Also a proprietary 32 bit application, Tibco Rendezvous. I get the same "audit_intercept: error 38, killing task" message.
Created attachment 118566
I assume this patch takes care of what Peter mentioned?
I can reproduce the exact same symptoms at will by simply calling statfs64() from a 32-bit program (compiled on RHEL 2.1 AS) run on 2.4.21-32.0.1.EL #1 Tue May 17 17:53:25 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux. It works from the command line, but under cron it's killed by the audit_intercept: error 38 code. I have my customer testing to see if the failure under cron can be reproduced at the command line using 'aurun'. Is there perhaps a more general problem with certain 32-bit syscalls (possibly in a given high numerical range) interacting poorly with the audit system on x86_64 systems?
Okay, following up on my own previous comment, here's what I've discovered in the 2.4.21-32.0.1.EL sources: arch/i386/kernel/entry.S defines all the syscalls. clock_gettime is 265, statfs64 is 268. There are a total of 270 syscalls. drivers/audit/syscall-x86_64.c uses IA32_NR_syscalls to set nr_syscalls and then performs the (code_raw >= nr_syscalls) check, as noted previously in this bug, to decide whether or not to kill the process. The problem is that include/asm-x86_64/ia32_unistd.h has an unconditional #define of IA32_NR_syscalls to be 265. Because of this mismatch, syscalls 265 through 270 will all get killed if called by a 32-bit process on a 64-bit kernel with auditing enabled. That's the real problem here. Can someone please investigate a fix?
I get the same problem with heartbeat + DRBD:
Oct 4 17:30:05 c-229 heartbeat: ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
Oct 4 17:30:06 c-229 heartbeat: info: Retrying failed stop operation [drbddisk::drbd_var_mqm]
Oct 4 17:30:06 c-229 heartbeat: info: Running /etc/ha.d/resource.d/drbddisk drbd_var_mqm stop
Oct 4 17:30:06 c-229 kernel: audit_intercept: error 14, killing task
I can't switch the resource group. Is there any fix for this? Does U6 fix it?
This issue is on Red Hat Engineering's list of planned work items for the upcoming Red Hat Enterprise Linux 3.8 release. Engineering resources have been assigned and barring unforeseen circumstances, Red Hat intends to include this item in the 3.8 release.
The information in comment #1 and the associated patch in comment #10 are bogus. The problem has already been correctly diagnosed in comment #12 (except for the fact that the syscall table has 271 entries, not 270). The correct fix is to make IA32_NR_syscalls in include/asm-x86_64/ia32_unistd.h match the value for NR_syscalls in include/linux/sys.h, which is 271. The associated table "ia32_sys_call_table" in arch/x86_64/ia32/ia32entry.S already contains 271 entries. I'll test a fix shortly and post it for consideration in U8. Thank you Michael Saletnik for your analysis.
Created attachment 127961
fix for correcting IA32_NR_syscalls value on x86_64
The patch above has been verified to correct the problem and has been posted for review (with less than two hours to spare before the U8 patch posting deadline).
A fix for this problem has just been committed to the RHEL3 U8 patch pool this evening (in kernel version 2.4.21-40.10.EL).
A kernel has been released that contains a patch for this problem. Please verify if your problem goes away with the latest available kernel from the RHEL3 public beta channel at rhn.redhat.com.
Reverting to ON_QA.
Kernel 2.4.21-43.ELsmp x86_64 was tested (on an otherwise update 7 system) and indeed I have verified that the problem has gone away for our product with this kernel. We now run fine (as tested by using aurun) instead of being killed. Thank you for the fix!
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0437.html