Bug 1368977 - systemctl start xxx hung, strace -p pid show: recvmsg(4, 0x7fffd1450970, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
Summary: systemctl start xxx hung, strace -p pid show: recvmsg(4, 0x7fffd1450970, MSG...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: systemd-maint
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-22 09:27 UTC by yzh
Modified: 2020-02-25 14:59 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-25 12:04:20 UTC
Target Upstream Version:


Description yzh 2016-08-22 09:27:00 UTC
Description of problem:
1. Two of my services hang when started with systemctl start xxx (this happens only rarely):
a.
/bin/bash -xe ./post_deploy.d/sso_proxy.sh
 \_ /bin/bash -xe ./resources/sso_proxy/install.sh
     \_ systemctl start hssoproxy
b.
/bin/hotplugusb
  \_sh -c systemctl start authorize_server
     \_systemctl start authorize_server

2. Neither service does anything on startup other than call daemon() when systemctl starts it.

3. Output of strace -p pid -s 1024:
ppoll([{fd=4, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=4, revents=POLLIN}])
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1I\0\0\0\233\245\1\0p\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
..........................
recvmsg(4, 0x7fffd1450970, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)

ppoll([{fd=4, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=4, revents=POLLIN}])
...........
recvmsg(4, 0x7fffd1450970, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)

Both services behave this way: the sequence above repeats in a loop.

Version-Release number of selected component (if applicable):


How reproducible:
The problem occurs rarely and is difficult to reproduce.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Lukáš Nykrýn 2016-08-22 10:35:34 UTC
It is better to use --no-block when you call systemctl from a script, especially when there is a chance that the script will itself be run from a systemd service.
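
A minimal sketch of this suggestion, assuming a POSIX shell script: it queues the start job with --no-block and then polls systemctl is-active under the script's own timeout, so the script cannot wait indefinitely on the job. The function name start_with_timeout, the 30-second default, and the return codes are illustrative choices, not part of systemd.

```shell
# Queue a start job without waiting for it, then poll the unit state.
# Returns 0 when the unit is active, 1 if the job could not be queued,
# 2 if the unit entered the failed state, 3 on timeout.
start_with_timeout() {
    unit=$1
    timeout=${2:-30}          # seconds to wait; 30 is an arbitrary default

    systemctl start --no-block "$unit" || return 1

    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # is-active exits non-zero for any state other than active,
        # so ignore its exit status and look at the printed state.
        state=$(systemctl is-active "$unit" || true)
        case $state in
            active) return 0 ;;
            failed) return 2 ;;
        esac
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 3
}
```

In the install script from the description this could be called as: start_with_timeout hssoproxy 60 || echo "hssoproxy did not become active" >&2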

Comment 3 yzh 2016-08-22 10:59:26 UTC
(In reply to Lukáš Nykrýn from comment #2)
> It is better to use --no-block when you call systemctl from a script,
> especially when there is a chance that the script will itself be run from
> a systemd service.

I don't understand what you mean. Why does this service hang? What can I do to solve or avoid the problem? I think this is a bug in systemd, because my service does nothing except call daemon(), and I think other services hit the same problem with systemctl start xxx.

Comment 5 yzh 2016-08-23 07:58:41 UTC
Hi,
  Today many services hung on systemctl start xxx. These services also do nothing except call daemon().
  I found that the process at the other end of the socket is PID 1 (/usr/lib/systemd/systemd --switched-root), so the strace log shows systemctl's internal socket communication with systemd.
  You advised using --no-block, but I don't think it helps, because my unit is xxxx.service, not xxxx.socket.
  I need your help; this problem is now affecting normal work. I hope you can give me some advice on how to avoid it. Ideally we would find the cause of the problem; I am also reading the code to look for a way to solve it.
  Thank you very much.

Comment 6 Jan Synacek 2017-01-25 12:04:20 UTC
There's nothing we can do with so little information. If you still have the problem, please provide a reproducer.

Comment 7 Paul Clements 2020-02-25 14:57:32 UTC
I have the same problem. Here is the end of the strace of:

systemctl start lifekeeper


recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1B\0\0\0\6\0\0\0q\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 178}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 178
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1M\0\0\0\7\0\0\0z\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 197}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 197
recvmsg(3, 0x7ffd8ea88e20, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=3, revents=POLLIN}])
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1H\0\0\0\10\0\0\0\226\0\0\0\1\1o\0004\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1/unit/s"..., 216}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 216
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1\304\2\0\0\t\0\0\0\226\0\0\0\1\1o\0004\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1/unit/s"..., 852}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 852
recvmsg(3, 0x7ffd8ea88e20, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=3, revents=POLLIN}])
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1Q\0\0\0\n\0\0\0x\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 193}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 193
recvmsg(3, 0x7ffd8ea88e20, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=3, revents=POLLIN}])
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1Q\0\0\0\v\0\0\0p\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 185}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 185
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1Q\0\0\0\f\0\0\0x\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 193}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 193
recvmsg(3, 0x7ffd8ea88e20, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=3, revents=POLLIN}])
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1Q\0\0\0\r\0\0\0p\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 185}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 185
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1Q\0\0\0\16\0\0\0x\0\0\0\1\1o\0\31\0\0\0", 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., 193}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=1, uid=0, gid=0}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 193
recvmsg(3, 0x7ffd8ea88e20, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8


systemctl just hangs and doesn't ever progress.
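
The 24-byte reads in the trace can be decoded by hand. The following is a sketch, assuming the D-Bus wire format in which the first 16 bytes of a message hold the endianness flag, message type, flags, protocol version, body length, serial, and header-field array length. decode_dbus_header is a hypothetical helper name, it handles only little-endian messages, and it expects the escaped bytes exactly as strace prints them.

```shell
# Decode the fixed 16-byte D-Bus header from a strace-style escaped string.
# Prints the message type, serial, body length, header-field array length,
# and the total message size implied by those lengths.
decode_dbus_header() {
    # Convert the escaped bytes to decimal values, one per positional parameter.
    set -- $(printf "$1" | od -An -v -tu1)
    [ "$#" -ge 16 ] || return 1
    [ "$1" -eq 108 ] || return 1      # 108 is 'l', the little-endian marker

    body=$(( $5 + ($6 << 8) + ($7 << 16) + ($8 << 24) ))
    serial=$(( $9 + (${10} << 8) + (${11} << 16) + (${12} << 24) ))
    fields=$(( ${13} + (${14} << 8) + (${15} << 16) + (${16} << 24) ))

    # The header (16 fixed bytes plus the field array) is padded to 8 bytes.
    padded=$(( (fields + 7) / 8 * 8 ))
    total=$(( 16 + padded + body ))

    echo "type=$2 serial=$serial body=$body fields=$fields total=$total"
}
```

For the first message in the trace, decode_dbus_header 'l\4\1\1B\0\0\0\6\0\0\0q\0\0\0' prints type=4 serial=6 body=66 fields=113 total=202. Type 4 is a signal in the D-Bus specification, and the 202-byte total matches the trace: 24 bytes in the first recvmsg plus 178 in the second. The messages shown are therefore signals sent by PID 1 that systemctl reads before going back to ppoll; the trace alone does not show which message it is still waiting for.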

I can add --no-block, and it doesn't hang, but it just exits 0 and doesn't start the service:

systemctl start --no-block --no-ask-password lifekeeper; echo $?
0
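
With --no-block, an exit status of 0 only reports that the start job was queued, not that the unit started. The following is a sketch for checking whether that job is still pending, assuming systemctl list-jobs --no-legend prints one job per line as job ID, unit, type, and state. pending_job is a hypothetical helper name.

```shell
# Print "ID TYPE STATE" for any queued job belonging to the given unit.
# Returns non-zero when no job for that unit is in the queue.
pending_job() {
    unit=$1
    # list-jobs shows full unit names, so add .service when no suffix is given.
    case $unit in
        *.*) ;;
        *) unit=$unit.service ;;
    esac

    systemctl list-jobs --no-legend --no-pager |
        awk -v u="$unit" '$2 == u { print $1, $3, $4; found = 1 }
                          END { exit !found }'
}
```

Running pending_job lifekeeper right after the --no-block call would show whether the start job is still queued. A job that stays in the waiting state is typically ordered behind another job that has not finished, which would be useful detail for a reproducer.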

Comment 8 Paul Clements 2020-02-25 14:59:14 UTC
system information:

rpm -q systemd
systemd-219-67.el7_7.2.x86_64

uname -a
Linux baymax 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

