Created attachment 1581753 [details] coredump Not sure if this is related to NIS. Server is running 4.20.16-200.fc29.x86_64 Notice the Duplicate cookie and below is a gdb of a coredump, which is included. Jun 14 00:48:55 ourserver systemd[39180]: Received SIGRTMIN+24 from PID 39199 (kill). Jun 14 00:48:56 ourserver systemd[1]: Stopped User Manager for UID 5472. Jun 14 00:48:56 ourserver systemd[1]: Stopping User Runtime Directory /run/user/5472... Jun 14 00:48:56 ourserver systemd[1]: Stopped User Runtime Directory /run/user/5472. Jun 14 00:48:56 ourserver systemd[1]: Removed slice User Slice of UID 5472. Jun 14 00:51:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 00:56:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:01:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:06:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:11:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:16:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:21:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:26:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:31:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:36:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:41:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:45:26 ourserver systemd[1]: Starting dnf makecache... Jun 14 01:45:26 ourserver dnf[22146]: Metadata cache refreshed recently. Jun 14 01:45:26 ourserver systemd[1]: Started dnf makecache. Jun 14 01:46:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:51:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 01:56:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 02:01:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 02:06:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 02:11:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 02:15:01 ourserver systemd[1]: Starting User Manager for UID 0... Jun 14 02:15:03 ourserver systemd[38437]: bind(/run/user/0/systemd/notify) failed: Address already in use Jun 14 02:15:03 ourserver systemd[38437]: Failed to fully start up daemon: Address already in use Jun 14 02:15:03 ourserver systemd[1]: user: Failed with result 'protocol'. Jun 14 02:15:03 ourserver systemd[1]: Failed to start User Manager for UID 0. Jun 14 02:15:03 ourserver systemd[1]: Started Session 19890 of user root. Jun 14 02:16:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 02:17:46 ourserver systemd[1]: Started /usr/bin/systemctl start man-db-cache-update. Jun 14 02:17:46 ourserver systemd[1]: Starting man-db-cache-update.service... Jun 14 02:17:46 ourserver systemd[1]: Started /usr/bin/systemctl start man-db-cache-update. Jun 14 02:18:09 ourserver systemd[1]: Started man-db-cache-update.service. Jun 14 02:19:27 ourserver systemd[1]: Starting User Manager for UID 0... Jun 14 02:19:32 ourserver systemd-logind[43885]: New session 19892 of user root. Jun 14 02:20:07 ourserver systemd[41574]: bind(/run/user/0/systemd/notify) failed: Address already in use Jun 14 02:20:07 ourserver systemd[41574]: Failed to fully start up daemon: Address already in use Jun 14 02:20:07 ourserver systemd[1]: user: Failed with result 'protocol'. Jun 14 02:20:07 ourserver systemd[1]: Failed to start User Manager for UID 0. Jun 14 02:20:07 ourserver systemd[1]: Started Session 19892 of user root. Jun 14 03:01:04 ourserver systemd[1]: Starting User Manager for UID 0... Jun 14 03:01:04 ourserver systemd[14158]: bind(/run/user/0/systemd/notify) failed: Address already in use Jun 14 03:01:04 ourserver systemd[14158]: Failed to fully start up daemon: Address already in use Jun 14 03:01:04 ourserver systemd[1]: user: Failed with result 'protocol'. Jun 14 03:01:04 ourserver systemd[1]: Failed to start User Manager for UID 0. Jun 14 03:01:04 ourserver systemd[1]: Started Session 19904 of user root. Jun 14 03:01:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 03:06:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 03:11:37 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 03:16:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 03:21:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 03:25:44 ourserver kernel: FS-Cache: Duplicate cookie detected Jun 14 03:25:44 ourserver kernel: FS-Cache: O-cookie c=0000000038870852 [p=0000000090eab239 fl=222 nc=0 na=1] Jun 14 03:25:44 ourserver kernel: FS-Cache: O-cookie d=000000001a86ab95 n=00000000b6293788 Jun 14 03:25:44 ourserver kernel: FS-Cache: O-key=[10] '040002000801966c4034' Jun 14 03:25:44 ourserver kernel: FS-Cache: N-cookie c=00000000d4cb8e2e [p=0000000090eab239 fl=2 nc=0 na=1] Jun 14 03:25:44 ourserver kernel: FS-Cache: N-cookie d=000000001a86ab95 n=00000000aaf3d265 Jun 14 03:25:44 ourserver kernel: FS-Cache: N-key=[10] '040002000801966c4034' Jun 14 03:26:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains. Jun 14 03:28:29 ourserver systemd-logind[43885]: Session 19904 logged out. Waiting for processes to exit. Jun 14 03:28:29 ourserver systemd-logind[43885]: Removed session 19904. Jun 14 03:28:34 ourserver systemd-logind[43885]: New session 19906 of user root. Jun 14 03:28:34 ourserver systemd[1]: Starting User Manager for UID 0... Jun 14 03:28:38 ourserver systemd[28928]: bind(/run/user/0/systemd/notify) failed: Address already in use Jun 14 03:28:38 ourserver systemd[28928]: Failed to fully start up daemon: Address already in use Jun 14 03:28:38 ourserver systemd[1]: user: Failed with result 'protocol'. Jun 14 03:28:38 ourserver systemd[1]: Failed to start User Manager for UID 0. Jun 14 03:28:38 ourserver systemd[1]: Started Session 19906 of user root. Jun 14 03:28:50 ourserver systemd-logind[43885]: Session 19906 logged out. Waiting for processes to exit. Jun 14 03:28:50 ourserver systemd-logind[43885]: Removed session 19906. Jun 14 03:28:50 ourserver systemd-logind[43885]: New session 19908 of user root. systemctl --version systemd 239 +PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid BFD: warning: /var/lib/systemd/coredump/core.systemd-journal.0.28154ddd19b243eda01287de82559656.31800.1560486433000000 is truncated: expected core file size >= 54960128, found: 4661248 BFD: warning: /var/lib/systemd/coredump/core.systemd-journal.0.28154ddd19b243eda01287de82559656.31800.1560486433000000 is truncated: expected core file size >= 54960128, found: 4661248 [New LWP 31800] warning: core file may not match specified executable file. Reading symbols from /usr/lib/systemd/systemd-journald...Reading symbols from /usr/lib/debug/usr/lib/systemd/systemd-journald-239-12.git8bca462.fc29.x86_64.debug...done. done. Cannot access memory at address 0x7fefce6f7128 Cannot access memory at address 0x7fefce6f7120 Failed to read a valid object file image from memory. Core was generated by `/usr/lib/systemd/systemd-journald'. Program terminated with signal SIGABRT, Aborted. #0 0x00007fefce266760 in ?? () (gdb) bt full #0 0x00007fefce266760 in ?? () No symbol table info available. Backtrace stopped: Cannot access memory at address 0x7ffe24f7e2b8
Created attachment 1581770 [details] strace -f -p 1 -s 500 -o log2 this is a strace of during a systemctl restart user
This is sounding like an issue with NIS: Jun 21 16:32:13 ourhost systemd[26692]: user: Failed to determine user credentials: No such process Jun 21 16:32:13 ourhost systemd[26692]: user: Failed at step USER spawning /usr/lib/systemd/systemd: No such process
Please see what processes have /run/user/0/systemd/notify open: $ sudo lsof /run/user/0/systemd/notify And also check what systemd thinks about those PIDs: $ systemctl status <pid>
The strace isn't particularly interesting. All that happens is that bind() fails with EADDRINUSE and the service fails to start. If you have the lsof information, you can kill that process that holds the file open and then try to restart user. If that still fails, it'd be interesting to get a strace.
(In reply to Zbigniew Jędrzejewski-Szmek from comment #3) > Please see what processes have /run/user/0/systemd/notify open: > $ sudo lsof /run/user/0/systemd/notify > > And also check what systemd thinks about those PIDs: > $ systemctl status <pid> lsof /run/user/0/systemd/notify COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME systemd root 16u unix 0x000000005ba64022 0t0 21017520 /run/user/0/systemd/notify type=DGRAM systemctl status 26945 ● user - User Manager for UID 0 Loaded: loaded (/usr/lib/systemd/system/user@.service; static; vendor preset: disabled) Active: active (running) since Mon 2019-08-05 16:28:26 EDT; 21h ago Docs: man:user@.service(5) Main PID: 26945 (systemd) Status: "Startup finished in 115ms." Tasks: 2 Memory: 6.3M CGroup: /user.slice/user-0.slice/user └─init.scope ├─26945 /usr/lib/systemd/systemd --user └─26948 (sd-pam) Aug 06 12:48:42 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded. Aug 06 12:52:27 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded. Aug 06 12:59:57 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded. Aug 06 13:03:28 ourdomain.edu systemd[26945]: run-user-300.mount: Succeeded. Aug 06 13:17:28 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded. Aug 06 13:29:59 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded. Aug 06 13:33:45 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded. Aug 06 13:40:01 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded. Aug 06 13:56:17 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded. Aug 06 14:06:24 ourdomain.edu systemd[26945]: run-user-300.mount: Succeeded.
This message is a reminder that Fedora 29 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '29'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 29 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.