Description of problem:
After applying the vixie-cron-4.1-36.FC4 upgrade, vixie-cron fails with

| Jul 14 13:12:03 news@londo cron.err (/usr/sbin/crond) /usr/sbin/crond[20060]: System error
| Jul 14 13:12:03 news@londo auth.info (crond(pam_unix)) crond(pam_unix)[20060]: session closed for user news

From time to time, I see messages like

| Jul 14 13:01:03 localhost@kosh kernel: audit(1121338863.675:0): user pid=29532 uid=0 length=96 loginuid=4294967295 msg='PAM setcred: user=root exe="/usr/sbin/crond" (hostname=?, addr=?, terminal=cron result=Success)'
| Jul 14 13:07:16 localhost@londo kernel: audit(1121339236.905:0): user pid=19113 uid=0 length=96 loginuid=4294967295 msg='PAM authentication: user=ensc exe="/bin/su" (hostname=?, addr=?, terminal=pts/6 result=Success)'
| Jul 14 13:07:17 localhost@londo kernel: audit(1121339237.146:0): user pid=19113 uid=0 length=92 loginuid=4294967295 msg='PAM accounting: user=ensc exe="/bin/su" (hostname=?, addr=?, terminal=pts/6 result=Success)'

also. Things worked fine until

| Jul 14 07:28:50 Updated: audit-libs.i386 0.9.15-1.FC4
| Jul 14 07:28:52 Updated: pam.i386 0.79-9.1
| Jul 14 07:28:53 Updated: procps.i386 3.2.5-6.3
| Jul 14 07:28:54 Updated: krb5-libs.i386 1.4.1-5
| Jul 14 07:28:54 Updated: popt.i386 1.10.1-22
| Jul 14 07:28:54 Updated: vixie-cron.i386 4:4.1-36.FC4

were updated.

Version-Release number of selected component (if applicable):
vixie-cron-4.1-36.FC4
pam-0.79-9.1
audit-libs-0.9.15-1.FC4

How reproducible:
100%

Additional info:
SELinux is disabled on the kernel CLI
Created attachment 116742 [details] /etc/pam.d/system-auth
Created attachment 116743 [details] 'strace -f -p <pid-of-cron>' output

strace output created by a

| * * * * * root /bin/true

line
The pam_setcred library call is failing, which causes this problem. It appears to be a problem with the audit or pam packages, so I've CC'ed their maintainers on this bug report. The change in vixie-cron-36.FC4 that triggers this is its new use of the pam_loginuid module, with this line in /etc/pam.d/crond: 'session required pam_loginuid.so'. Removing this line should work around the problem. Since you are seeing audit messages in the system log, and you've listed the 'audit-libs' RPM but not the 'audit' RPM, it would appear you are not running the audit daemon. If you run the audit daemon (install the audit RPM and run 'service auditd start'), does the problem still persist?
If it's the pam_setcred call that fails, removing the pam_loginuid.so line won't help. Also, you should try updating the kernel to the latest available in FC4 updates if you haven't done so already.
I read through the strace and see a problem. The area of interest is the recvfrom calls. The first says that the user message type is rejected with EINVAL. This indicates an old kernel is being used. The message is then retried using the deprecated message type, which now returns EPERM. The audit code checks whether the uid is non-zero, in which case it would make an exception. It turns out to be 0, as returned by getuid32. This indicates that root has lost CAP_AUDIT_WRITE.
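A rough model of the flow described above, for anyone following along. This is not the actual audit library code: `send_user_message`, `kernel_send` and the stub kernel are hypothetical names, and the numeric message types (1005 for the deprecated type, 1101 for a newer accounting type) are my recollection of linux/audit.h.

```python
import errno

AUDIT_USER = 1005  # deprecated user message type (assumed value, from memory of linux/audit.h)


def send_user_message(kernel_send, msg_type, uid):
    """Model of the described fallback; returns 0 or a negative errno.

    kernel_send is a hypothetical stand-in that takes a message type and
    returns 0 on success or a negative errno, like the netlink reply does.
    """
    rc = kernel_send(msg_type)
    if rc == -errno.EINVAL:
        # Old kernel: the newer type is unknown, so retry with the deprecated one.
        rc = kernel_send(AUDIT_USER)
    if rc == -errno.EPERM and uid != 0:
        # Unprivileged callers are excused; only root is expected to succeed.
        return 0
    return rc


def old_kernel_without_cap(msg_type):
    """Stub matching the strace: new types unknown, deprecated type refused."""
    return -errno.EPERM if msg_type == AUDIT_USER else -errno.EINVAL
```

With the stub kernel, a caller with uid 0 gets -EPERM back, which is the case crond hits, while a non-root caller would have been let through.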
Additionally, it's PAM accounting that's failing in the strace.

23900 sendto(8, "t\0\0\0\355\3\5\0\2\0\0\0\0\0\0\0PAM accounting: user=root exe=\"/usr/sbin/crond\" (hostname=?, addr=?, terminal=cron result=Success)\0\0", 116, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 116
23900 select(9, [8], NULL, NULL, {0, 100000}) = 1 (in [8], left {0, 100000})
23900 recvfrom(8, "$\0\0\0\2\0\0\0\2\0\0\0\\]\0\0\377\377\377\377t\0\0\0\355\3\5\0\2\0\0\0\0\0\0\0", 8476, MSG_PEEK|MSG_DONTWAIT, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 36

The four octal 377 bytes are the error field of the reply: -1, which is -EPERM.
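To double-check that reading of the reply, the bytes can be decoded by hand. The sketch below assumes the usual netlink error layout (a 16-byte nlmsghdr, a 32-bit error code, then the header of the rejected request) and little-endian byte order, since the packages listed are i386. `decode_nlmsgerr` is a name made up for illustration.

```python
import errno
import struct

# Reply bytes copied from the recvfrom() line of the strace.
REPLY = (
    b"$\0\0\0\2\0\0\0\2\0\0\0\\]\0\0"     # nlmsghdr of the reply
    b"\377\377\377\377"                    # error code
    b"t\0\0\0\355\3\5\0\2\0\0\0\0\0\0\0"  # nlmsghdr of the rejected request
)

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid (little-endian here).
NLMSGHDR = struct.Struct("<IHHII")
NLMSG_ERROR = 2


def decode_nlmsgerr(buf):
    """Split an NLMSG_ERROR reply into header fields, error code and echoed request header."""
    length, msg_type, flags, seq, pid = NLMSGHDR.unpack_from(buf, 0)
    if msg_type != NLMSG_ERROR:
        raise ValueError("not an NLMSG_ERROR message")
    (error,) = struct.unpack_from("<i", buf, NLMSGHDR.size)
    req = NLMSGHDR.unpack_from(buf, NLMSGHDR.size + 4)
    return {
        "length": length,
        "pid": pid,
        "error": error,
        "errno_name": errno.errorcode.get(-error),
        "request_length": req[0],
        "request_type": req[1],
        "request_flags": req[2],
    }


if __name__ == "__main__":
    for key, value in decode_nlmsgerr(REPLY).items():
        print(key, value)
```

The pid field of the reply decodes to 23900, matching the crond child in the trace, and the echoed request has length 116, matching the sendto. The request type comes out as 1005, which as far as I can tell from linux/audit.h is the deprecated user message type.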
Thanks, Tomas & Steve. So it appears the fix for this bug is to upgrade to the latest kernel (2.6.12-1.1390_FC4). Could the reporter please try this and let us know if it fixes the problem?
I am looking at the kernel logic to see where this is coming from. So that we can better identify this in the future, could you say which kernel you are using? Also, are you doing anything that would cause CAP_AUDIT_WRITE to be lost? I agree with the others that trying the latest kernel could help the situation.
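One way to check whether a process still holds CAP_AUDIT_WRITE is to look at the CapEff line in /proc/<pid>/status. A minimal sketch, assuming CAP_AUDIT_WRITE is bit 29 as in linux/capability.h; the function names are hypothetical.

```python
CAP_AUDIT_WRITE = 29  # bit number, assumed from linux/capability.h


def effective_caps(status_text):
    """Return the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_audit_write(status_text):
    """True if the CAP_AUDIT_WRITE bit is set in the effective set."""
    return bool(effective_caps(status_text) & (1 << CAP_AUDIT_WRITE))


def check_pid(pid):
    """Read /proc/<pid>/status (Linux only) and report on CAP_AUDIT_WRITE."""
    with open("/proc/%d/status" % pid) as status:
        return has_audit_write(status.read())
```

Calling `check_pid` with the pid of the running crond, as root, would show whether the capability has been dropped from its effective set.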
I cannot reproduce this problem with the latest FC-4 versions of kernel, pam, and audit. If you can, please re-open this bug - closing as "CURRENTRELEASE".
Yes, you are right; it happened on a hardened kernel where CAP_AUDIT_WRITE was too new to be considered secure.