Bug 1744682

Summary: rsyslog: imfile crash on log rotation
Product: Red Hat Enterprise Linux 7 Reporter: Stepan Broz <sbroz>
Component: rsyslogAssignee: Jiří Vymazal <jvymazal>
Status: CLOSED ERRATA QA Contact: Dalibor Pospíšil <dapospis>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.6CC: dapospis, fkrska, jvymazal, nhosoi, rmeggins, rmetrich, sradco
Target Milestone: rcKeywords: EasyFix, Patch, Reproducer, Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rsyslog-8.24.0-46.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1744691 1746497 (view as bug list) Environment:
Last Closed: 2020-03-31 19:10:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1744691, 1746497    

Description Stepan Broz 2019-08-22 16:54:42 UTC
Description of problem:
When rsyslog is configured to use the imfile input module to e.g. forward /var/log/audit/audit.log, there is a chance of rsyslog crashing on audit.log rotation.

It seems to be "use after free" issue in act_obj_destroy().

Backtrace of the crash thread:

Program terminated with signal 11, Segmentation fault.
#0  0x0000556a048d9325 in getMSGLen (pM=0x203a293137393539) at msg.c:1684
1684		return((pM == NULL) ? 0 : pM->iLenMSG);
(gdb) bt
#0  0x0000556a048d9325 in getMSGLen (pM=0x203a293137393539) at msg.c:1684
#1  0x0000556a048f91e5 in doLastMessageRepeatedNTimes (ppRepMsg=0x7fb379268950, pMsg=0x7fb36c0d0f50, ratelimit=0x7fb36c004ce0)
    at ratelimit.c:84
#2  ratelimitMsg (ratelimit=0x7fb36c004ce0, pMsg=pMsg@entry=0x7fb36c0d0f50, ppRepMsg=ppRepMsg@entry=0x7fb379268950)
    at ratelimit.c:224
#3  0x0000556a048f93ee in ratelimitAddMsg (ratelimit=<optimized out>, pMultiSub=pMultiSub@entry=0x7fb36c002468, 
    pMsg=0x7fb36c0d0f50) at ratelimit.c:261
#4  0x00007fb37a0a5004 in enqLine (strtOffs=<optimized out>, cstrLine=0x7fb36c002260, act=0x7fb36c002400) at imfile.c:1120
#5  pollFileReal (pCStr=0x7fb3792689b8, act=0x7fb36c002400) at imfile.c:1317
#6  pollFile (act=act@entry=0x7fb36c002400) at imfile.c:1348
#7  0x00007fb37a0a5378 in act_obj_destroy (act=act@entry=0x7fb36c002400, is_deleted=is_deleted@entry=1) at imfile.c:816
#8  0x00007fb37a0a57cb in act_obj_unlink (act=act@entry=0x7fb36c002400) at imfile.c:901
#9  0x00007fb37a0a589f in detect_updates (edge=edge@entry=0x556a05377e00) at imfile.c:641
#10 0x00007fb37a0a6813 in poll_tree (chld=0x556a05377e00) at imfile.c:719
#11 0x00007fb37a0a268d in fs_node_walk (node=<optimized out>, f_usr=f_usr@entry=0x7fb37a0a67d0 <poll_tree>) at imfile.c:936
#12 0x00007fb37a0a2699 in fs_node_walk (node=<optimized out>, f_usr=f_usr@entry=0x7fb37a0a67d0 <poll_tree>) at imfile.c:937
#13 0x00007fb37a0a2699 in fs_node_walk (node=<optimized out>, f_usr=f_usr@entry=0x7fb37a0a67d0 <poll_tree>) at imfile.c:937
#14 0x00007fb37a0a2699 in fs_node_walk (node=<optimized out>, f_usr=f_usr@entry=0x7fb37a0a67d0 <poll_tree>) at imfile.c:937
#15 0x00007fb37a0a2699 in fs_node_walk (node=<optimized out>, f_usr=f_usr@entry=0x7fb37a0a67d0 <poll_tree>) at imfile.c:937
#16 0x00007fb37a0a5fc1 in doPolling () at imfile.c:1829
#17 runInput (pThrd=<optimized out>) at imfile.c:2056
#18 0x0000556a04901444 in thrdStarter (arg=0x556a0538a0c0) at ../threads.c:226
#19 0x00007fb37d7e1ea5 in start_thread (arg=0x7fb37926b700) at pthread_create.c:307
#20 0x00007fb37c8d58cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

(gdb) print *pM
Cannot access memory at address 0x203a293137393539

(gdb) frame 7
#7  0x00007fb37a0a5378 in act_obj_destroy (act=act@entry=0x7fb36c002400, is_deleted=is_deleted@entry=1) at imfile.c:816
816			pollFile(act); /* get any left-over data */
(gdb) list
811		if(act->ratelimiter != NULL) {
812			ratelimitDestruct(act->ratelimiter);
813		}
814		if(act->pStrm != NULL) {
815			const instanceConf_t *const inst = act->edge->instarr[0];// TODO: same file, multiple instances?
816			pollFile(act); /* get any left-over data */
817			if(inst->bRMStateOnDel) {
818				statefn = getStateFileName(act, statefile, sizeof(statefile));
819				getFullStateFileName(statefn, toDel, sizeof(toDel));
820				statefn = toDel;

(gdb) print *act->ratelimiter
$1 = {name = 0x4f52503d65707974 <Address 0x4f52503d65707974 out of bounds>, interval = 21571, burst = 21577, severity = 76 'L', 
  done = 1631414131, missed = 1953064053, begin = 3474865994935382312, bReduceRepeatMsgs = 775370037, nsupp = 976631346, 
  pMsg = 0x203a293137393539, bThreadSafe = 112 'p', bNoTimeCache = 114 'r', mut = {__data = {__lock = 842480997, 
      __count = 808469558, __owner = 909526066, __nusers = 909520944, __kind = 1127627062, __spins = 13622, __elision = 17973, 
      __list = {__prev = 0x3933313330333233, __next = 0x4432383330334432}}, __size = "e=726D002D660066696C655F323031392D30382D", 
    __align = 3472350312264056165}}

act->ratelimiter was already destructed (and freed) in act_obj_destroy(), but later used by ratelimitAddMsg(), etc.

Version-Release number of selected component (if applicable):
rsyslog-8.24.0-38.el7

How reproducible:
Always, but with random delay

Steps to Reproduce:
1. Create audit rule to audit file deletions.
2. Set max_log_file = 1 in /etc/audit/auditd.conf to increase rotation frequency.
3. Create and remove a file in a tight loop (in shell) to create high audit traffic.

    $ while true; do touch file_1; sleep .1; rm -f file_1; done

4. Configure rsyslog to use imfile on /var/log/audit/audit.log. E.g. create /etc/rsyslog.d/imfile.conf as follows:

    $ModLoad imfile


    $InputFileName /var/log/audit/audit.log
    $InputFileTag auditd
    $InputFileStateFile state-audit-log-remote
    $InputFileSeverity notice
    $InputFileFacility authpriv
    $InputFilePollInterval 10
    $InputFilePersistStateInterval 0
    $InputRunFileMonitor

5. Set Permissive SELinux mode

    # setenforce 0

6. Wait.

Actual results:
The rsyslogd daemon will eventually crash on log rotation made by auditd. A message similar to the following can appear in the journal:

Aug 22 18:16:07 kernel: traps: in:imfile[26885] general protection ip:556a048d9325 sp:7fb379268908 error:0 in rsyslogd[556a048b9000+95000]

Expected results:
No crash

Additional info:

This issue is not resolved in RHEL-8.
Upstream fix: https://github.com/rsyslog/rsyslog/pull/3052

Comment 2 Stepan Broz 2019-08-22 17:00:36 UTC
When rsyslogd is run in valgrind, similar logs can be seen on audit rotation:

==20765== Invalid read of size 4
==20765==    at 0x14810E: ratelimitMsg (ratelimit.c:223)
==20765==    by 0x1483ED: ratelimitAddMsg (ratelimit.c:261)
==20765==    by 0x8B9D003: enqLine (imfile.c:1120)
==20765==    by 0x8B9D003: pollFileReal (imfile.c:1317)
==20765==    by 0x8B9D003: pollFile (imfile.c:1348)
==20765==    by 0x8B9D377: act_obj_destroy (imfile.c:816)
==20765==    by 0x8B9D89E: detect_updates (imfile.c:641)
==20765==    by 0x8B9E812: poll_tree (imfile.c:719)
==20765==    by 0x8B9A68C: fs_node_walk (imfile.c:936)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9DFC0: doPolling (imfile.c:1829)
==20765==    by 0x8B9DFC0: runInput (imfile.c:2056)
==20765==  Address 0x63d51b0 is 32 bytes inside a block of size 96 free'd
==20765==    at 0x4C2AF5D: free (vg_replace_malloc.c:540)
==20765==    by 0x8B9D35D: act_obj_destroy (imfile.c:812)
==20765==    by 0x8B9D89E: detect_updates (imfile.c:641)
==20765==    by 0x8B9E812: poll_tree (imfile.c:719)
==20765==    by 0x8B9A68C: fs_node_walk (imfile.c:936)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9DFC0: doPolling (imfile.c:1829)
==20765==    by 0x8B9DFC0: runInput (imfile.c:2056)
==20765==    by 0x150443: thrdStarter (threads.c:226)
==20765==    by 0x5054EA4: start_thread (pthread_create.c:307)
==20765==  Block was alloc'd at
==20765==    at 0x4C2BF79: calloc (vg_replace_malloc.c:762)
==20765==    by 0x1484E9: ratelimitNew (ratelimit.c:291)
==20765==    by 0x8B9E306: act_obj_add (imfile.c:586)
==20765==    by 0x8B9E9C8: poll_tree (imfile.c:760)
==20765==    by 0x8B9A68C: fs_node_walk (imfile.c:936)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9A698: fs_node_walk (imfile.c:937)
==20765==    by 0x8B9DFC0: doPolling (imfile.c:1829)
==20765==    by 0x8B9DFC0: runInput (imfile.c:2056)
==20765==    by 0x150443: thrdStarter (threads.c:226)
==20765==    by 0x5054EA4: start_thread (pthread_create.c:307)

Comment 11 errata-xmlrpc 2020-03-31 19:10:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1000