Bug 1424940
Summary: | [systemd] rpc.nfsd: writing fd to kernel failed: errno 110 (Connection timed out) | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | JianHong Yin <jiyin> |
Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
Status: | CLOSED DUPLICATE | QA Contact: | JianHong Yin <jiyin> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 25 | CC: | bcodding, bfields, chunwang, eguan, jiyin, jlayton, smayhew, steved, yoyang |
Target Milestone: | --- | Keywords: | Regression, Reproducer, TestBlocker |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-04-07 15:26:19 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1422947 |
Description
JianHong Yin
2017-02-20 06:33:12 UTC
How reproducible: always ''' [root@bkr-hv03-guest40 ~]# echo "/usr/share *(ro)" >/etc/exports [root@bkr-hv03-guest40 ~]# systemctl stop nfs [root@bkr-hv03-guest40 ~]# time systemctl start nfs Job for nfs-server.service failed because the control process exited with error code. See "systemctl status nfs-server.service" and "journalctl -xe" for details. real 8m47.408s user 0m0.003s sys 0m0.005s [root@bkr-hv03-guest40 ~]# systemctl status nfs ● nfs-server.service - NFS server and services Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Mon 2017-02-20 01:46:04 EST; 2min 23s ago Process: 13858 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=1/FAILURE) Process: 13857 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS) Main PID: 13858 (code=exited, status=1/FAILURE) Feb 20 01:37:17 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: Starting NFS server and services... Feb 20 01:46:04 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com rpc.nfsd[13858]: rpc.nfsd: writing fd to kernel failed: errno 110 (Connection timed out) Feb 20 01:46:04 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com rpc.nfsd[13858]: rpc.nfsd: unable to set any sockets for nfsd Feb 20 01:46:04 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: nfs-server.service: Main process exited, code=exited, status=1/FAILURE Feb 20 01:46:04 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: Failed to start NFS server and services. Feb 20 01:46:04 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: nfs-server.service: Unit entered failed state. Feb 20 01:46:04 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: nfs-server.service: Failed with result 'exit-code'. ''' I think the problem may be with the latest rpcbind update. What happens if you run this before step 3 in your reproducer?: systemd-tmpfiles --create Or better yet, try this: systemd-tmpfiles --create /usr/lib/tmpfiles.d/rpcbind.conf I think that probably needs to be added to the postinstall scriptlet of rpcbind. (In reply to Scott Mayhew from comment #4) > Or better yet, try this: > > systemd-tmpfiles --create /usr/lib/tmpfiles.d/rpcbind.conf > > I think that probably needs to be added to the postinstall scriptlet of > rpcbind. seems systemd-tmpfiles can not solve the problem: ''' [root@bkr-hv03-guest40 ~]# cat /etc/exports /usr/share *(ro) [root@bkr-hv03-guest40 ~]# systemd-tmpfiles --create /usr/lib/tmpfiles.d/rpcbind.conf [root@bkr-hv03-guest40 ~]# time systemctl start nfs Job for nfs-server.service failed because the control process exited with error code. See "systemctl status nfs-server.service" and "journalctl -xe" for details. real 8m47.313s user 0m0.004s sys 0m0.005s [root@bkr-hv03-guest40 ~]# time systemctl status nfs ● nfs-server.service - NFS server and services Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Mon 2017-02-20 21:48:47 EST; 1min 7s ago Process: 15662 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=1/FAILURE) Process: 15661 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS) Main PID: 15662 (code=exited, status=1/FAILURE) Feb 20 21:40:00 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: Starting NFS server and services... Feb 20 21:48:47 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com rpc.nfsd[15662]: rpc.nfsd: writing fd to kernel failed: errno 110 (Connection timed out) Feb 20 21:48:47 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com rpc.nfsd[15662]: rpc.nfsd: unable to set any sockets for nfsd Feb 20 21:48:47 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: nfs-server.service: Main process exited, code=exited, status=1/FAILURE Feb 20 21:48:47 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: Failed to start NFS server and services. Feb 20 21:48:47 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: nfs-server.service: Unit entered failed state. Feb 20 21:48:47 bkr-hv03-guest40.dsal.lab.eng.bos.redhat.com systemd[1]: nfs-server.service: Failed with result 'exit-code'. real 0m0.009s user 0m0.005s sys 0m0.002s ''' ''' 15564 pts/1 Ss 0:00 | \_ -bash 15651 pts/1 S+ 0:00 | \_ systemctl start nfs 15652 pts/1 S+ 0:00 | \_ /usr/bin/systemd-tty-ask-password ''' (In reply to Scott Mayhew from comment #4) > Or better yet, try this: > > systemd-tmpfiles --create /usr/lib/tmpfiles.d/rpcbind.conf > > I think that probably needs to be added to the postinstall scriptlet of > rpcbind. I think this commit took care of the problem you are talking about: commit 0df1187eb1cde0df1a2cd1ffb09ab7782a9b5471 Author: Steve Dickson <steved> Date: Sat Jan 28 15:46:33 2017 -0500 Corrected boot dependency in systemd files (bz 1401561) Signed-off-by: Steve Dickson <steved> (In reply to Yin.JianHong from comment #5) > ''' > 15564 pts/1 Ss 0:00 | \_ -bash > 15651 pts/1 S+ 0:00 | \_ systemctl start nfs > 15652 pts/1 S+ 0:00 | \_ > /usr/bin/systemd-tty-ask-password > ''' The question I have is why is systemd-tty-ask-password being called?? I'm thinking that's where the time is being spent... (In reply to Steve Dickson from comment #6) > (In reply to Scott Mayhew from comment #4) > > Or better yet, try this: > > > > systemd-tmpfiles --create /usr/lib/tmpfiles.d/rpcbind.conf > > > > I think that probably needs to be added to the postinstall scriptlet of > > rpcbind. > > I think this commit took care of the problem you are talking about: > > commit 0df1187eb1cde0df1a2cd1ffb09ab7782a9b5471 > Author: Steve Dickson <steved> > Date: Sat Jan 28 15:46:33 2017 -0500 > > Corrected boot dependency in systemd files (bz 1401561) > > Signed-off-by: Steve Dickson <steved> I see... I can reproduce the problem easily though, and in my case the problem is that nfsd can't register with rpcbind because rpcbind's not running, and rpcbind's not running because /run/rpcbind doesn't exist and it can't create the lock file. [root@localhost ~]# systemctl status nfs ● nfs-server.service - NFS server and services Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled) Drop-In: /run/systemd/generator/nfs-server.service.d └─order-with-mounts.conf Active: failed (Result: exit-code) since Tue 2017-02-21 08:38:49 EST; 15s ago Process: 28004 ExecStopPost=/usr/sbin/exportfs -f (code=exited, status=0/SUCCESS) Process: 28002 ExecStopPost=/usr/sbin/exportfs -au (code=exited, status=0/SUCCESS) Process: 27998 ExecStop=/usr/sbin/rpc.nfsd 0 (code=exited, status=0/SUCCESS) Process: 28035 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=1/FAILURE) Process: 28034 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS) Main PID: 28035 (code=exited, status=1/FAILURE) Feb 20 11:36:38 localhost.localdomain systemd[1]: Starting NFS server and services... Feb 21 08:38:49 localhost.localdomain rpc.nfsd[28035]: rpc.nfsd: writing fd to kernel failed: errno 110 (Connection timed out) Feb 21 08:38:49 localhost.localdomain rpc.nfsd[28035]: rpc.nfsd: unable to set any sockets for nfsd Feb 21 08:38:49 localhost.localdomain systemd[1]: nfs-server.service: Main process exited, code=exited, status=1/FAILURE Feb 21 08:38:49 localhost.localdomain systemd[1]: Failed to start NFS server and services. Feb 21 08:38:49 localhost.localdomain systemd[1]: nfs-server.service: Unit entered failed state. Feb 21 08:38:49 localhost.localdomain systemd[1]: nfs-server.service: Failed with result 'exit-code'. [root@localhost ~]# systemctl status rpcbind ● rpcbind.service - RPC Bind Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Mon 2017-02-20 11:24:50 EST; 21h ago Docs: man:rpcbind(8) Main PID: 2879 (code=exited, status=1/FAILURE) Feb 20 11:24:50 localhost.localdomain systemd[1]: Starting RPC Bind... Feb 20 11:24:50 localhost.localdomain rpcbind[2879]: rpcbind: /run/rpcbind/rpcbind.lock: No such file or directory Feb 20 11:24:50 localhost.localdomain systemd[1]: rpcbind.service: Main process exited, code=exited, status=1/FAILURE Feb 20 11:24:50 localhost.localdomain systemd[1]: Failed to start RPC Bind. Feb 20 11:24:50 localhost.localdomain systemd[1]: rpcbind.service: Unit entered failed state. Feb 20 11:24:50 localhost.localdomain systemd[1]: rpcbind.service: Failed with result 'exit-code'. (In reply to Steve Dickson from comment #7) > (In reply to Yin.JianHong from comment #5) > > ''' > > 15564 pts/1 Ss 0:00 | \_ -bash > > 15651 pts/1 S+ 0:00 | \_ systemctl start nfs > > 15652 pts/1 S+ 0:00 | \_ > > /usr/bin/systemd-tty-ask-password > > ''' > The question I have is why is systemd-tty-ask-password being called?? > I'm thinking that's where the time is being spent... Just a thought... since rpcbind runs as the rpc user instead of root, maybe that's preventing it from using systemd-tmpfile-setup? As a side note... Maybe we should consider having rpc.nfsd print a more helpful error message when it sees -ETIMEDOUT here? That's almost always occurring because it can't talk to rpcbind. It might also not hurt to consider allowing knfsd to start up when rpcbind registration fails, but throw a printk to warn about missing registration. I think I finally fix this problem.... Please see https://bugzilla.redhat.com/show_bug.cgi?id=1401561 *** This bug has been marked as a duplicate of bug 1401561 *** |