Bug 1299804

Summary: Systemd-journald crashed and dumped core while running I/O's on cifs mount with rhgs layered install
Product: Red Hat Enterprise Linux 7 Reporter: surabhi <sbhaloth>
Component: systemd Assignee: systemd-maint
Status: CLOSED WONTFIX QA Contact: qe-baseos-daemons
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.2 CC: systemd-maint-list
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-12-15 07:39:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description surabhi 2016-01-19 10:03:04 UTC
Description of problem:
On a setup where a layered install of rhgs-server and rhgs-samba was done on top of RHEL 7.2, a systemd-journald crash was seen while running I/O on a CIFS mount and running several other Samba server-side test cases multiple times.

****************************************

The backtrace is as follows:

Core was generated by `/usr/lib/systemd/systemd-journald'.
Program terminated with signal 6, Aborted.
#0  0x00007f4c50f0408d in __GI_readlinkat (fd=-100, path=0x7ffc36e52b70 "/proc/1/exe", buf=0x7f4c527739d0 "/usr/lib/systemd/systemd", len=99)
    at ../sysdeps/unix/sysv/linux/readlinkat.c:45
45	      result = INLINE_SYSCALL (readlinkat, 4, fd, path, buf, len);
(gdb) bt
#0  0x00007f4c50f0408d in __GI_readlinkat (fd=-100, path=0x7ffc36e52b70 "/proc/1/exe", buf=0x7f4c527739d0 "/usr/lib/systemd/systemd", len=99)
    at ../sysdeps/unix/sysv/linux/readlinkat.c:45
#1  0x00007f4c52334257 in __readlinkat_alias () at /usr/include/bits/unistd.h:185
#2  readlinkat_malloc (p=p@entry=0x7ffc36e52b70 "/proc/1/exe", ret=ret@entry=0x7ffc36e52c78, fd=-100) at src/shared/util.c:1010
#3  0x00007f4c52334342 in readlink_malloc (p=<optimized out>, ret=<optimized out>) at src/shared/util.c:1029
#4  get_process_link_contents (name=0x7ffc36e52c78, proc_file=0x7ffc36e52b70 "/proc/1/exe") at src/shared/util.c:838
#5  get_process_exe (pid=<optimized out>, name=name@entry=0x7ffc36e52c78) at src/shared/util.c:853
#6  0x00007f4c52319688 in dispatch_message_real.4064 (s=s@entry=0x7ffc36e53520, iovec=iovec@entry=0x7f4c52777500, n=13, n@entry=9, m=m@entry=66, 
    ucred=ucred@entry=0x7ffc36e532e0, tv=tv@entry=0x7ffc36e532c0, label=label@entry=0x7ffc36e53300 "system_u:system_r:init_t:s0", label_len=label_len@entry=28, 
    unit_id=unit_id@entry=0x0, priority=priority@entry=27, object_pid=object_pid@entry=0) at src/journal/journald-server.c:597
#7  0x00007f4c5233ebcf in server_dispatch_message (s=s@entry=0x7ffc36e53520, iovec=0x7f4c52777500, n=n@entry=9, m=66, ucred=ucred@entry=0x7ffc36e532e0, 
    tv=tv@entry=0x7ffc36e532c0, label=label@entry=0x7ffc36e53300 "system_u:system_r:init_t:s0", label_len=label_len@entry=28, unit_id=unit_id@entry=0x0, 
    priority=priority@entry=27, object_pid=0) at src/journal/journald-server.c:917
#8  0x00007f4c5232d245 in server_process_native_message (s=s@entry=0x7ffc36e53520, buffer=<optimized out>, buffer_size=233, ucred=ucred@entry=0x7ffc36e532e0, 
    tv=tv@entry=0x7ffc36e532c0, label=label@entry=0x7ffc36e53300 "system_u:system_r:init_t:s0", label_len=label_len@entry=28) at src/journal/journald-native.c:286
#9  0x00007f4c5232da0e in server_process_datagram (es=<optimized out>, fd=4, revents=<optimized out>, userdata=0x7ffc36e53520) at src/journal/journald-server.c:1211
#10 0x00007f4c5232ee40 in source_dispatch (s=s@entry=0x7f4c52765480) at src/libsystemd/sd-event/sd-event.c:2115
#11 0x00007f4c5232ffba in sd_event_dispatch (e=e@entry=0x7f4c52765190) at src/libsystemd/sd-event/sd-event.c:2472
#12 0x00007f4c52314fac in sd_event_run (timeout=18446744073709551615, e=0x7f4c52765190) at src/libsystemd/sd-event/sd-event.c:2501
#13 main (argc=<optimized out>, argv=<optimized out>) at src/journal/journald.c:109

Version-Release number of selected component (if applicable):
cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.2 (Maipo)


How reproducible:
Seen once

Steps to Reproduce:
1. Do a layered install of rhgs-server and rhgs-samba on top of RHEL 7.2.
2. Start the automation suite on the CIFS mount. It has test cases such as mkdir, creating files, dd to create a 1 GB file, renames, deletes on the mount, and creating and deleting volumes multiple times.
3. Check for failures, crashes, and errors in the logs.
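
The file-level part of step 2 can be approximated with a short script. This is only a sketch: the actual automation suite is not attached to this report, so the function name `run_io_pass`, its parameters, and the file names are hypothetical, and the volume create/delete cycle is left out because it needs the gluster CLI on the server side.

```python
import os
import shutil


def run_io_pass(mount_dir, big_file_bytes=1 << 30, small_files=10, chunk=1 << 20):
    """Run one mkdir / create / big-write / rename / delete pass under mount_dir.

    Hypothetical stand-in for the suite's file-level test cases.
    Returns (size of the big file, sorted directory listing before cleanup).
    """
    work = os.path.join(mount_dir, "io_test")
    os.mkdir(work)  # mkdir test case

    # Create a handful of small files.
    for i in range(small_files):
        with open(os.path.join(work, f"file{i}"), "wb") as f:
            f.write(b"x" * 4096)

    # Write one large file in chunks (stands in for the dd test case).
    big = os.path.join(work, "big.bin")
    zeros = bytes(chunk)
    remaining = big_file_bytes
    with open(big, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
    size = os.path.getsize(big)

    # Rename every small file.
    for i in range(small_files):
        os.rename(os.path.join(work, f"file{i}"),
                  os.path.join(work, f"renamed{i}"))

    names = sorted(os.listdir(work))
    shutil.rmtree(work)  # delete-on-mount test case
    return size, names
```

Calling `run_io_pass` in a loop with the CIFS mount point as `mount_dir` would repeat the pass; the default writes a 1 GiB file, so pass a smaller `big_file_bytes` for a quick check.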

Actual results:
There is a systemd crash. There was also an OOM kill invoked by smbd, for which a separate BZ has been raised.

Expected results:
systemd should not crash.

Additional info:
Sosreports and core dump will be uploaded soon.

Comment 4 Lukáš Nykrýn 2016-06-09 08:03:14 UTC
It looks like part of the disk was not accessible and readlinkat hung. The journal therefore did not ping the watchdog, so systemd killed it and started it again. If there was some other issue with I/O, this is expected behavior.
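
For readers unfamiliar with the watchdog mechanism referred to above, here is a minimal sketch of the keep-alive protocol, assuming the documented sd_notify interface: the service manager exports `NOTIFY_SOCKET` and `WATCHDOG_USEC`, and the service sends `WATCHDOG=1` datagrams. This is not journald's code; the helper names `watchdog_ping_interval` and `run_service_loop` are made up for illustration.

```python
import os
import socket


def sd_notify(state):
    """Send a state string to the service manager; False if nobody listens."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a manager that expects notifications
    if path.startswith("@"):
        path = "\0" + path[1:]  # "@" prefix means abstract-namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True


def watchdog_ping_interval():
    """Return half the watchdog timeout in seconds, or None if it is not set."""
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2


def run_service_loop(do_work, iterations):
    """Hypothetical main loop: one unit of work, then one keep-alive ping.

    If do_work() blocks for longer than the watchdog timeout (for example
    on a system call that never returns), no ping is sent in that window
    and the manager treats the service as hung.
    """
    pings = 0
    for _ in range(iterations):
        do_work()
        if sd_notify("WATCHDOG=1"):
            pings += 1
    return pings
```

When the timeout expires, systemd by default aborts the service with SIGABRT, which is consistent with the "signal 6, Aborted" in the core above.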

Comment 7 RHEL Program Management 2020-12-15 07:39:41 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.