Bug 1721574 - bind(/run/user/0/systemd/notify) failed: Address already in use
Summary: bind(/run/user/0/systemd/notify) failed: Address already in use
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: nss_nis
Version: 29
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Matej Mužila
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-06-18 15:06 UTC by RobbieTheK
Modified: 2019-11-27 23:11 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-27 23:11:23 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
coredump (4.45 MB, application/x-core)
2019-06-18 15:06 UTC, RobbieTheK
no flags Details
strace -f -p 1 -s 500 -o log2 (932.42 KB, text/plain)
2019-06-18 15:27 UTC, RobbieTheK
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github systemd systemd issues 9656 0 'None' closed systemd --user sometimes fails with systemd[1]: user@1000.service: Failed with result 'protocol' 2020-12-14 21:47:19 UTC
Red Hat Bugzilla 1023820 0 unspecified CLOSED slow shutdown unit user@0.service entered failed state 2021-02-22 00:41:40 UTC

Description RobbieTheK 2019-06-18 15:06:26 UTC
Created attachment 1581753 [details]
coredump

Not sure if this is related to NIS. Server is running 4.20.16-200.fc29.x86_64

Notice the Duplicate cookie and below is a gdb of a coredump, which is included.

Jun 14 00:48:55 ourserver systemd[39180]: Received SIGRTMIN+24 from PID 39199 (kill).
Jun 14 00:48:56 ourserver systemd[1]: Stopped User Manager for UID 5472.
Jun 14 00:48:56 ourserver systemd[1]: Stopping User Runtime Directory /run/user/5472...
Jun 14 00:48:56 ourserver systemd[1]: Stopped User Runtime Directory /run/user/5472.
Jun 14 00:48:56 ourserver systemd[1]: Removed slice User Slice of UID 5472.
Jun 14 00:51:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 00:56:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:01:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:06:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:11:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:16:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:21:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:26:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:31:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:36:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:41:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:45:26 ourserver systemd[1]: Starting dnf makecache...
Jun 14 01:45:26 ourserver dnf[22146]: Metadata cache refreshed recently.
Jun 14 01:45:26 ourserver systemd[1]: Started dnf makecache.
Jun 14 01:46:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:51:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 01:56:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 02:01:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 02:06:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 02:11:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 02:15:01 ourserver systemd[1]: Starting User Manager for UID 0...
Jun 14 02:15:03 ourserver systemd[38437]: bind(/run/user/0/systemd/notify) failed: Address already in use
Jun 14 02:15:03 ourserver systemd[38437]: Failed to fully start up daemon: Address already in use
Jun 14 02:15:03 ourserver systemd[1]: user: Failed with result 'protocol'.
Jun 14 02:15:03 ourserver systemd[1]: Failed to start User Manager for UID 0.
Jun 14 02:15:03 ourserver systemd[1]: Started Session 19890 of user root.
Jun 14 02:16:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 02:17:46 ourserver systemd[1]: Started /usr/bin/systemctl start man-db-cache-update.
Jun 14 02:17:46 ourserver systemd[1]: Starting man-db-cache-update.service...
Jun 14 02:17:46 ourserver systemd[1]: Started /usr/bin/systemctl start man-db-cache-update.
Jun 14 02:18:09 ourserver systemd[1]: Started man-db-cache-update.service.
Jun 14 02:19:27 ourserver systemd[1]: Starting User Manager for UID 0...
Jun 14 02:19:32 ourserver systemd-logind[43885]: New session 19892 of user root.
Jun 14 02:20:07 ourserver systemd[41574]: bind(/run/user/0/systemd/notify) failed: Address already in use
Jun 14 02:20:07 ourserver systemd[41574]: Failed to fully start up daemon: Address already in use
Jun 14 02:20:07 ourserver systemd[1]: user: Failed with result 'protocol'.
Jun 14 02:20:07 ourserver systemd[1]: Failed to start User Manager for UID 0.
Jun 14 02:20:07 ourserver systemd[1]: Started Session 19892 of user root.

Jun 14 03:01:04 ourserver systemd[1]: Starting User Manager for UID 0...
Jun 14 03:01:04 ourserver systemd[14158]: bind(/run/user/0/systemd/notify) failed: Address already in use
Jun 14 03:01:04 ourserver systemd[14158]: Failed to fully start up daemon: Address already in use
Jun 14 03:01:04 ourserver systemd[1]: user: Failed with result 'protocol'.
Jun 14 03:01:04 ourserver systemd[1]: Failed to start User Manager for UID 0.
Jun 14 03:01:04 ourserver systemd[1]: Started Session 19904 of user root.
Jun 14 03:01:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 03:06:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 03:11:37 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 03:16:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 03:21:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 03:25:44 ourserver kernel: FS-Cache: Duplicate cookie detected
Jun 14 03:25:44 ourserver kernel: FS-Cache: O-cookie c=0000000038870852 [p=0000000090eab239 fl=222 nc=0 na=1]
Jun 14 03:25:44 ourserver kernel: FS-Cache: O-cookie d=000000001a86ab95 n=00000000b6293788
Jun 14 03:25:44 ourserver kernel: FS-Cache: O-key=[10] '040002000801966c4034'
Jun 14 03:25:44 ourserver kernel: FS-Cache: N-cookie c=00000000d4cb8e2e [p=0000000090eab239 fl=2 nc=0 na=1]
Jun 14 03:25:44 ourserver kernel: FS-Cache: N-cookie d=000000001a86ab95 n=00000000aaf3d265
Jun 14 03:25:44 ourserver kernel: FS-Cache: N-key=[10] '040002000801966c4034'
Jun 14 03:26:35 ourserver ypbind[20820]: 20822: Ping active servers for all domains.
Jun 14 03:28:29 ourserver systemd-logind[43885]: Session 19904 logged out. Waiting for processes to exit.
Jun 14 03:28:29 ourserver systemd-logind[43885]: Removed session 19904.
Jun 14 03:28:34 ourserver systemd-logind[43885]: New session 19906 of user root.
Jun 14 03:28:34 ourserver systemd[1]: Starting User Manager for UID 0...
Jun 14 03:28:38 ourserver systemd[28928]: bind(/run/user/0/systemd/notify) failed: Address already in use
Jun 14 03:28:38 ourserver systemd[28928]: Failed to fully start up daemon: Address already in use
Jun 14 03:28:38 ourserver systemd[1]: user: Failed with result 'protocol'.
Jun 14 03:28:38 ourserver systemd[1]: Failed to start User Manager for UID 0.
Jun 14 03:28:38 ourserver systemd[1]: Started Session 19906 of user root.
Jun 14 03:28:50 ourserver systemd-logind[43885]: Session 19906 logged out. Waiting for processes to exit.
Jun 14 03:28:50 ourserver systemd-logind[43885]: Removed session 19906.
Jun 14 03:28:50 ourserver systemd-logind[43885]: New session 19908 of user root.
systemctl --version
systemd 239
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid
BFD: warning: /var/lib/systemd/coredump/core.systemd-journal.0.28154ddd19b243eda01287de82559656.31800.1560486433000000 is truncated: expected core file size >= 54960128, found: 4661248
BFD: warning: /var/lib/systemd/coredump/core.systemd-journal.0.28154ddd19b243eda01287de82559656.31800.1560486433000000 is truncated: expected core file size >= 54960128, found: 4661248
[New LWP 31800]

warning: core file may not match specified executable file.
Reading symbols from /usr/lib/systemd/systemd-journald...Reading symbols from /usr/lib/debug/usr/lib/systemd/systemd-journald-239-12.git8bca462.fc29.x86_64.debug...done.
done.
Cannot access memory at address 0x7fefce6f7128
Cannot access memory at address 0x7fefce6f7120
Failed to read a valid object file image from memory.
Core was generated by `/usr/lib/systemd/systemd-journald'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fefce266760 in ?? ()
(gdb) bt full
#0  0x00007fefce266760 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7ffe24f7e2b8

Comment 1 RobbieTheK 2019-06-18 15:27:46 UTC
Created attachment 1581770 [details]
strace -f -p 1 -s 500 -o log2

this is a strace of during a systemctl restart user

Comment 2 RobbieTheK 2019-06-21 20:40:19 UTC
This is sounding like an issue with NIS:

Jun 21 16:32:13 ourhost systemd[26692]: user: Failed to determine user credentials: No such process
Jun 21 16:32:13 ourhost systemd[26692]: user: Failed at step USER spawning /usr/lib/systemd/systemd: No such process

Comment 3 Zbigniew Jędrzejewski-Szmek 2019-08-06 15:44:15 UTC
Please see what processes have /run/user/0/systemd/notify open:
$ sudo lsof /run/user/0/systemd/notify

And also check what systemd thinks about those PIDs:
$ systemctl status <pid>

Comment 4 Zbigniew Jędrzejewski-Szmek 2019-08-06 16:06:19 UTC
The strace isn't particularly interesting. All that happens is that bind() fails with EADDRINUSE
and the service fails to start. If you have the lsof information, you can kill that process
that holds the file open and then try to restart user. If that still fails, it'd
be interesting to get a strace.

Comment 5 RobbieTheK 2019-08-06 18:15:14 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #3)
> Please see what processes have /run/user/0/systemd/notify open:
> $ sudo lsof /run/user/0/systemd/notify
> 
> And also check what systemd thinks about those PIDs:
> $ systemctl status <pid>


lsof /run/user/0/systemd/notify
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
systemd 	root   16u  unix 0x000000005ba64022      0t0 21017520 /run/user/0/systemd/notify type=DGRAM

systemctl status 26945
● user - User Manager for UID 0
   Loaded: loaded (/usr/lib/systemd/system/user@.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2019-08-05 16:28:26 EDT; 21h ago
     Docs: man:user@.service(5)
 Main PID: 26945 (systemd)
   Status: "Startup finished in 115ms."
    Tasks: 2
   Memory: 6.3M
   CGroup: /user.slice/user-0.slice/user
           └─init.scope
             ├─26945 /usr/lib/systemd/systemd --user
             └─26948 (sd-pam)

Aug 06 12:48:42 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded.
Aug 06 12:52:27 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded.
Aug 06 12:59:57 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded.
Aug 06 13:03:28 ourdomain.edu systemd[26945]: run-user-300.mount: Succeeded.
Aug 06 13:17:28 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded.
Aug 06 13:29:59 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded.
Aug 06 13:33:45 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded.
Aug 06 13:40:01 ourdomain.edu systemd[26945]: u-professor1.mount: Succeeded.
Aug 06 13:56:17 ourdomain.edu systemd[26945]: u-professor2.mount: Succeeded.
Aug 06 14:06:24 ourdomain.edu systemd[26945]: run-user-300.mount: Succeeded.

Comment 6 Ben Cotton 2019-10-31 18:58:38 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 7 Ben Cotton 2019-11-27 23:11:23 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.


Note You need to log in before you can comment on or make changes to this bug.