Bug 1222746
Summary: | rsyslog crashes while writing to disk queue (MsgSerialize) | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Susant Sahani <ssahani> | ||||||||||||||
Component: | rsyslog | Assignee: | Radovan Sroka <rsroka> | ||||||||||||||
Status: | CLOSED ERRATA | QA Contact: | Stefan Dordevic <sdordevi> | ||||||||||||||
Severity: | high | Docs Contact: | |||||||||||||||
Priority: | high | ||||||||||||||||
Version: | 7.1 | CC: | cww, moagrawa, pvrabec, rsroka, rvdwees, ssahani | ||||||||||||||
Target Milestone: | rc | Keywords: | Patch | ||||||||||||||
Target Release: | 7.2 | ||||||||||||||||
Hardware: | x86_64 | ||||||||||||||||
OS: | Linux | ||||||||||||||||
Whiteboard: | |||||||||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||||
Clone Of: | Environment: | ||||||||||||||||
Last Closed: | 2016-11-04 05:35:39 UTC | Type: | Bug | ||||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||||
Documentation: | --- | CRM: | |||||||||||||||
Verified Versions: | Category: | --- | |||||||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||
Embargoed: | |||||||||||||||||
Bug Depends On: | |||||||||||||||||
Bug Blocks: | 1203710, 1295396, 1296594, 1313485 | ||||||||||||||||
Attachments: |
|
Description
Susant Sahani
2015-05-19 04:20:54 UTC
(gdb) bt #0 0x00007f3b370475d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f3b37048cc8 in __GI_abort () at abort.c:90 #2 0x00007f3b38698a08 in sigsegvHdlr (signum=6) at debug.c:849 #3 <signal handler called> #4 0x00007f3b370475d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #5 0x00007f3b37048cc8 in __GI_abort () at abort.c:90 #6 0x00007f3b37087e07 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f3b371908c8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196 #7 0x00007f3b3708f1fd in malloc_printerr (ptr=<optimized out>, str=0x7f3b371909a0 "double free or corruption (!prev)", action=3) at malloc.c:4972 Double free. The malloc chunk is corrupted now. #8 _int_free (av=0x7f3b373cc760 <main_arena>, p=<optimized out>, have_lock=1) at malloc.c:3804 #9 0x00007f3b37091a50 in _int_realloc (av=av@entry=0x7f3b373cc760 <main_arena>, oldp=oldp@entry=0x7f3b0c001a30, oldsize=oldsize@entry=400, nb=nb@entry=752) at malloc.c:4295 #10 0x00007f3b37092702 in __GI___libc_realloc (oldmem=0x7f3b0c001a40, bytes=bytes@entry=737) at malloc.c:2998 We got a abort on frame 10 on glibc #11 0x00007f3b377f58a0 in printbuf_extend (min_size=<optimized out>, p=0x7f3b0c001a20) at printbuf.c:72 We trying to extend the buffer which essentially calls to realloc #12 printbuf_memappend (p=p@entry=0x7f3b0c001a20, buf=buf@entry=0x7f3b31bdea30 "\"", size=size@entry=1) at printbuf.c:82 #13 0x00007f3b377f5a56 in sprintbuf (p=p@entry=0x7f3b0c001a20, msg=msg@entry=0x7f3b377f5c79 "\"") at printbuf.c:175 #14 0x00007f3b377f194d in json_object_string_to_json_string (jso=0x7f3b2c1bf9b0, pb=0x7f3b0c001a20, level=<optimized out>, flags=<optimized out>) at json_object.c:622 #15 0x00007f3b377f1ebd in json_object_object_to_json_string (jso=<optimized out>, pb=0x7f3b0c001a20, level=0, flags=1) at json_object.c:325 #16 0x00007f3b377f1d5c in json_object_to_json_string_ext (jso=0x7f3b2c0d9440, flags=flags@entry=1) at json_object.c:270 #17 0x00007f3b377f1daa in json_object_to_json_string (jso=<optimized out>) at json_object.c:280 #18 0x00007f3b377f25a5 in json_object_get_string (jso=<optimized out>) at json_object.c:665 #19 0x00007f3b38690e60 in MsgSerialize (pThis=0x7f3b2c1bffc0, pStrm=0x7f3b3a0d2e60) at msg.c:1068 <====== rsyslog is trying to write to disk queue here. It does use the json libray to do so. #20 0x00007f3b386a230c in qAddDisk (pThis=0x7f3b3a0d2b10, pMsg=0x7f3b2c1bffc0) at queue.c:902 #21 0x00007f3b386a5eee in qqueueAdd (pMsg=<optimized out>, pThis=0x7f3b3a0d2b10) at queue.c:1024 #22 doEnqSingleObj (pMsg=0x7f3b2c1bffc0, flowCtlType=eFLOWCTL_NO_DELAY, pThis=0x7f3b3a0d2b10) at queue.c:2637 #23 qqueueEnqMsg (pThis=0x7f3b3a0d2b10, flowCtlType=flowCtlType@entry=eFLOWCTL_NO_DELAY, pMsg=<optimized out>) at queue.c:2738 #24 0x00007f3b386a6155 in ConsumerDA (pThis=0x7f3b3a0c40d0, pWti=0x7f3b3a0d21d0) at queue.c:1922 #25 0x00007f3b386a1ad8 in wtiWorker (pThis=pThis@entry=0x7f3b3a0d21d0) at wti.c:313 #26 0x00007f3b386a0d61 in wtpWorker (arg=0x7f3b3a0d21d0) at wtp.c:388 #27 0x00007f3b38010df5 in start_thread (arg=0x7f3b31bdf700) at pthread_create.c:308 #28 0x00007f3b371081ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 Hello Susant, thanks for you help. The daemon definitely shouldn't crash but if it doesn't happen often, it will be hard to pinpoint the underlying issue. We don't need backtraces as much as we need a reliable reproducer. Do we have access to their configuration? Created attachment 1087135 [details]
01446126-Conf
Created attachment 1097995 [details]
Full rsyslog configuration is attached
Created attachment 1122166 [details]
Reproducer
Here is the reproducer.
Created attachment 1122840 [details]
patch
This patch introduces new mutex inside of message structure to handle totally thread unsafe library json-c and locks suspicious json function calls.
Created attachment 1127297 [details]
updated patch
changed type from int to rsRetVal
Created attachment 1181194 [details]
Script to reproduce the crash
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2401.html |