Bug 1587995
Summary: | auditd sometimes in failed state after boot
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | redhat-airlock
Component: | audit
Assignee: | Steve Grubb <sgrubb>
Status: | CLOSED ERRATA
QA Contact: | BaseOS QE Security Team <qe-baseos-security>
Severity: | high
Docs Contact: | Ioanna Gkioka <igkioka>
Priority: | urgent
Version: | 7.5
CC: | dpal, fkrska, igkioka, mmatsuya, omoris, rahranja, sgrubb
Target Milestone: | rc
Keywords: | ZStream
Hardware: | x86_64
OS: | Linux
Fixed In Version: | audit-2.8.4-3.el7
Doc Type: | Bug Fix
Doc Text: | Previously, systemd in certain cases terminated the audit daemon unexpectedly during startup. As a consequence, the audit.service was in a failed state after the boot process. With this update, an unnecessary function call has been removed, and the described problem no longer occurs.
: | 1607298
Last Closed: | 2018-10-30 11:28:31 UTC
Type: | Bug
Bug Blocks: | 1607298
Description
redhat-airlock
2018-06-06 12:44:47 UTC
Hello.

By any chance did you try to run with the debug version of auditd mentioned in bug 1470810? It might be the same thing or maybe something else.

Hi,

The debug RPM on [1] seems to be no longer available. Even if it was, it is a different version (audit-2.7.7-2.fc26 vs audit-2.8.1-3.el7.x86_64). If it's possible for you to provide a debug version of the el7 RPM, I can test it with that.

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=918802

The debug patch is in this repo: git://pkgs.fedoraproject.org/rpms/audit , commit f37518040e617f75b7f8bc3e225b6c90fb9e4602 .

It was difficult to find the bug, but we have found it :-)

This is the stack of a hanging auditd process (pstack):

    Thread 2 (Thread 0x7f7f0c4c0700 (LWP 4999)):
    #0  0x00007f7f0e3c7995 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x000055d600e8efc3 in flush_thread_main ()
    #2  0x00007f7f0e3c3e25 in start_thread () from /lib64/libpthread.so.0
    #3  0x00007f7f0d6aebad in clone () from /lib64/libc.so.6
    Thread 1 (Thread 0x7f7f0f25a880 (LWP 4998)):
    #0  0x00007f7f0d6bc89c in __lll_lock_wait_private () from /lib64/libc.so.6
    #1  0x00007f7f0d6669bc in _L_lock_2546 () from /lib64/libc.so.6
    #2  0x00007f7f0d6667f7 in __tz_convert () from /lib64/libc.so.6
    #3  0x00007f7f0d6a873a in __vsyslog_chk () from /lib64/libc.so.6
    #4  0x00007f7f0e7e4025 in audit_msg () from /lib64/libaudit.so.1
    #5  0x000055d600e94e6d in dispatcher_reaped ()
    #6  0x000055d600e8e0a5 in child_handler ()
    #7  <signal handler called>
    #8  0x00007f7f0d6a906a in mmap64 () from /lib64/libc.so.6
    #9  0x00007f7f0d61de21 in __GI__IO_file_doallocate () from /lib64/libc.so.6
    #10 0x00007f7f0d62cc27 in __GI__IO_doallocbuf () from /lib64/libc.so.6
    #11 0x00007f7f0d62a61c in __GI__IO_file_xsgetn () from /lib64/libc.so.6
    #12 0x00007f7f0d62945e in fread_unlocked () from /lib64/libc.so.6
    #13 0x00007f7f0d666c8d in __tzfile_read () from /lib64/libc.so.6
    #14 0x00007f7f0d665e64 in tzset_internal () from /lib64/libc.so.6
    #15 0x00007f7f0d666813 in __tz_convert () from /lib64/libc.so.6
    #16 0x00007f7f0d6a873a in __vsyslog_chk () from /lib64/libc.so.6
    #17 0x00007f7f0e7e4025 in audit_msg () from /lib64/libaudit.so.1
    #18 0x000055d600e94d45 in init_dispatcher ()
    #19 0x000055d600e8d05e in main ()

This means that syslog() was called in a signal handler. syslog() is not async-signal-safe, see the "signal-safety" man page. The process hangs when it tries to call syslog() from the signal handler.

We found some more bugs with a code review: FD_CLOEXEC is not set on init_pipe[1] (a simple typo) and EINTR handling is missing for a read().

With all 3 bugs fixed, auditd starts reliably without hanging. It would be great if you could integrate the bugfixes into an update.

Created attachment 1454967 [details]
Proposed patch
Thanks for the patch and troubleshooting! I am truly surprised by this. The libev documentation says that signals are serialized with other events and not delivered asynchronously (from a signal handler). In any event, the message being logged is not important and is more like a debug level message. I should be able to fit this into the next release. Thanks!

audit-2.8.4-3.el7 was built to address this issue.

> I should be able to fit this into the next release. Thanks!

Great!

> The libev documentation says that signals are serialized with other events and not delivered asynchronously (from a signal handler).

It happens during startup, probably before libev is in control. Probably it's this signal handler:

    /* This signal handler gets replaced later. Its here in case
     * the dispatcher exits before libev is in control */
    sa.sa_handler = child_handler2;
    sigaction(SIGCHLD, &sa, NULL);

child_handler2() calls child_handler(), and child_handler() calls dispatcher_reaped() which calls syslog().

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3237