Bug 1299804 - Systemd-journald crashed and dumped core while running I/O's on cifs mount with rhgs layered install
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.2
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: systemd-maint
QA Contact: qe-baseos-daemons
Depends On:
Blocks:
Reported: 2016-01-19 05:03 EST by surabhi
Modified: 2018-02-14 18:06 EST

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description surabhi 2016-01-19 05:03:04 EST
Description of problem:
On a setup where a layered install of rhgs-server and rhgs-samba was done on top of RHEL 7.2, a systemd-journald crash was seen while running I/O on a CIFS mount and running several other Samba server-side test cases multiple times.

****************************************

bt is as follows:

Core was generated by `/usr/lib/systemd/systemd-journald'.
Program terminated with signal 6, Aborted.
#0  0x00007f4c50f0408d in __GI_readlinkat (fd=-100, path=0x7ffc36e52b70 "/proc/1/exe", buf=0x7f4c527739d0 "/usr/lib/systemd/systemd", len=99)
    at ../sysdeps/unix/sysv/linux/readlinkat.c:45
45	      result = INLINE_SYSCALL (readlinkat, 4, fd, path, buf, len);
(gdb) bt
#0  0x00007f4c50f0408d in __GI_readlinkat (fd=-100, path=0x7ffc36e52b70 "/proc/1/exe", buf=0x7f4c527739d0 "/usr/lib/systemd/systemd", len=99)
    at ../sysdeps/unix/sysv/linux/readlinkat.c:45
#1  0x00007f4c52334257 in __readlinkat_alias () at /usr/include/bits/unistd.h:185
#2  readlinkat_malloc (p=p@entry=0x7ffc36e52b70 "/proc/1/exe", ret=ret@entry=0x7ffc36e52c78, fd=-100) at src/shared/util.c:1010
#3  0x00007f4c52334342 in readlink_malloc (p=<optimized out>, ret=<optimized out>) at src/shared/util.c:1029
#4  get_process_link_contents (name=0x7ffc36e52c78, proc_file=0x7ffc36e52b70 "/proc/1/exe") at src/shared/util.c:838
#5  get_process_exe (pid=<optimized out>, name=name@entry=0x7ffc36e52c78) at src/shared/util.c:853
#6  0x00007f4c52319688 in dispatch_message_real.4064 (s=s@entry=0x7ffc36e53520, iovec=iovec@entry=0x7f4c52777500, n=13, n@entry=9, m=m@entry=66, 
    ucred=ucred@entry=0x7ffc36e532e0, tv=tv@entry=0x7ffc36e532c0, label=label@entry=0x7ffc36e53300 "system_u:system_r:init_t:s0", label_len=label_len@entry=28, 
    unit_id=unit_id@entry=0x0, priority=priority@entry=27, object_pid=object_pid@entry=0) at src/journal/journald-server.c:597
#7  0x00007f4c5233ebcf in server_dispatch_message (s=s@entry=0x7ffc36e53520, iovec=0x7f4c52777500, n=n@entry=9, m=66, ucred=ucred@entry=0x7ffc36e532e0, 
    tv=tv@entry=0x7ffc36e532c0, label=label@entry=0x7ffc36e53300 "system_u:system_r:init_t:s0", label_len=label_len@entry=28, unit_id=unit_id@entry=0x0, 
    priority=priority@entry=27, object_pid=0) at src/journal/journald-server.c:917
#8  0x00007f4c5232d245 in server_process_native_message (s=s@entry=0x7ffc36e53520, buffer=<optimized out>, buffer_size=233, ucred=ucred@entry=0x7ffc36e532e0, 
    tv=tv@entry=0x7ffc36e532c0, label=label@entry=0x7ffc36e53300 "system_u:system_r:init_t:s0", label_len=label_len@entry=28) at src/journal/journald-native.c:286
#9  0x00007f4c5232da0e in server_process_datagram (es=<optimized out>, fd=4, revents=<optimized out>, userdata=0x7ffc36e53520) at src/journal/journald-server.c:1211
#10 0x00007f4c5232ee40 in source_dispatch (s=s@entry=0x7f4c52765480) at src/libsystemd/sd-event/sd-event.c:2115
#11 0x00007f4c5232ffba in sd_event_dispatch (e=e@entry=0x7f4c52765190) at src/libsystemd/sd-event/sd-event.c:2472
#12 0x00007f4c52314fac in sd_event_run (timeout=18446744073709551615, e=0x7f4c52765190) at src/libsystemd/sd-event/sd-event.c:2501
#13 main (argc=<optimized out>, argv=<optimized out>) at src/journal/journald.c:109
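
Frames #0 to #5 show that journald was reading the /proc/1/exe symlink of the sending process when it was aborted. As a rough illustration only, the same kind of lookup can be sketched in Python. This is not systemd's C code, and `process_exe` is a hypothetical helper name chosen for this example.

```python
import os


def process_exe(pid):
    """Return the target of /proc/<pid>/exe, or None if it cannot be read.

    Hypothetical helper for illustration; systemd's get_process_exe() is C.
    """
    try:
        # Same kind of symlink read as frame #0 of the backtrace.
        return os.readlink(f"/proc/{pid}/exe")
    except OSError:
        # No /proc, process gone, or permission denied.
        return None


if __name__ == "__main__":
    print(process_exe(os.getpid()))
```

The call is synchronous: if the kernel blocks while resolving the link, the caller makes no progress until it returns.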

Version-Release number of selected component (if applicable):
cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.2 (Maipo)


How reproducible:
Seen once

Steps to Reproduce:
1. Do a layered install of rhgs-server and samba on top of RHEL 7.2.
2. Start the automation suite on the CIFS mount. It includes test cases such as mkdir, file creation, dd to create a 1 GB file, renames, deletes on the mount, and creating and deleting volumes multiple times.
3. Check for failures, crashes, and errors in the logs.
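
The file operations in step 2 can be approximated with a small script. This is only a sketch of the kind of workload described, not the actual automation suite: `run_io_cycle` and its parameters are hypothetical names, a temporary directory stands in for the CIFS mount point, and volume create/delete is not covered.

```python
import os
import shutil
import tempfile


def run_io_cycle(root, big_file_bytes=1024 ** 3, chunk=1024 * 1024):
    """Run one mkdir / create / large write / rename / delete cycle under root.

    Returns the size in bytes of the large file that was written.
    """
    workdir = os.path.join(root, "io_test")
    os.mkdir(workdir)

    # Create a few small files.
    for i in range(10):
        with open(os.path.join(workdir, f"file_{i}"), "w") as f:
            f.write("data\n")

    # Write one large file in chunks, similar in spirit to dd.
    big = os.path.join(workdir, "bigfile")
    remaining = big_file_bytes
    block = b"\0" * chunk
    with open(big, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n
    size = os.path.getsize(big)

    # Rename the large file, then delete everything again.
    os.rename(big, os.path.join(workdir, "bigfile.renamed"))
    shutil.rmtree(workdir)
    return size


if __name__ == "__main__":
    # A local temp directory stands in for the CIFS mount point here.
    with tempfile.TemporaryDirectory() as mount_point:
        print(run_io_cycle(mount_point, big_file_bytes=4 * 1024 * 1024))
```

To exercise a real mount, pass its path as `root` and call the function in a loop.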

Actual results:
systemd-journald crashed. There was also an OOM kill invoked by smbd, for which a separate BZ has been raised.

Expected results:
systemd should not crash.

Additional info:
Sosreports and core dump will be uploaded soon.
Comment 4 Lukáš Nykrýn 2016-06-09 04:03:14 EDT
It looks like part of the disk was not accessible and readlinkat hung. The journal therefore did not ping the watchdog, so systemd killed it and started it again. If there was some other issue with I/O, this is expected behavior.
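
For readers unfamiliar with the watchdog mentioned here: a service supervised by systemd with a watchdog timeout is expected to send `WATCHDOG=1` to the socket named in `$NOTIFY_SOCKET` at regular intervals, and `$WATCHDOG_USEC` carries the timeout in microseconds. The following is a minimal Python sketch of that protocol, not journald's implementation. `main_loop` and `do_work` are hypothetical names, and pinging at half the timeout is a common convention assumed here.

```python
import os
import socket
import time


def watchdog_interval():
    """Return half of WATCHDOG_USEC in seconds, or None if no watchdog is set."""
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2


def notify(message):
    """Send one datagram to $NOTIFY_SOCKET; return False if it is not set."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # Abstract socket address: the leading "@" stands for a NUL byte.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def main_loop(do_work, iterations):
    """Ping the watchdog between units of work (hypothetical loop)."""
    interval = watchdog_interval()
    last_ping = None
    for _ in range(iterations):
        do_work()  # if this blocks past the timeout, no ping is sent
        now = time.monotonic()
        if interval is not None and (last_ping is None or now - last_ping >= interval):
            notify("WATCHDOG=1")
            last_ping = now
```

In a single-threaded loop like this, one blocked call inside `do_work` is enough to miss the deadline, which matches the situation described in this comment.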
