Bug 773433 - NFS mount fails: network.service starts after remote-fs
Summary: NFS mount fails: network.service starts after remote-fs
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 16
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-01-11 19:56 UTC by Hans Ecke
Modified: 2012-10-30 17:58 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-30 17:58:17 UTC
Type: ---


Attachments (Terms of Use)
output of "dmesg -s10000000" (221.99 KB, text/plain)
2012-01-11 19:56 UTC, Hans Ecke
no flags Details
output from "systemctl dump" (445.93 KB, text/plain)
2012-01-11 19:57 UTC, Hans Ecke
no flags Details
output from "/bin/systemd --test --system --log-level=debug 2>&1" (460.50 KB, text/plain)
2012-01-11 19:58 UTC, Hans Ecke
no flags Details
a copy of /var/log/messages (64.45 KB, text/plain)
2012-01-11 19:59 UTC, Hans Ecke
no flags Details
dmesg after setting the fs_passno field of NFS mounts to 0 (248.93 KB, text/plain)
2012-01-12 00:06 UTC, Hans Ecke
no flags Details

Description Hans Ecke 2012-01-11 19:56:38 UTC
Created attachment 552213 [details]
output of "dmesg -s10000000"

Description of problem:

Systemd attempts to mount remote filesystems before it executes network.service. Accordingly, the NFS mount times out.

This is on Fedora 16, without NetworkManager but instead with the static network.service.

Version-Release number of selected component (if applicable):

37-3.fc16

How reproducible:

Always

Steps to Reproduce:
1. restart system
2. it hangs till the NFS mounts timeout
3. login on console and do a manual "mount -a"
  
Actual results:

NFS does not get mounted because the network is not ready yet

Expected results:

network.service (and rpcbind etc) should execute pretty early in bootup, maybe as soon as udev renames eth0 to em1. Then the NFS mount would succeed.

Additional info:

I worked through a lot of the systemd/network related bugs here and I understand it _should_ work. Something seems to be different on my system. Could you help me find what?

I attach the info requested on http://fedoraproject.org/wiki/How_to_debug_Systemd_problems

/proc/cmdline:

BOOT_IMAGE=/vmlinuz-3.1.7-1.fc16.x86_64 root=UUID=cc66242e-9e3a-401b-8f0b-fcade9972710 ro rd.md=0 rd.lvm=0 rd.dm=0 KEYTABLE=us selinux=0 SYSFONT=latarcyrheb-sun16 rd.luks=0 LANG=en_US.UTF-8 rdblacklist=nouveau nouveau.modeset=0 log_buf_len=1048576 systemd.log_level=debug systemd.log_target=kmsg

dmesg output .. attached

/var/log/messages and output from systemctl dump and systemd --test: I can not see how I can attach more than one file to the initial bug report, so I'll do that momentarily...

Comment 1 Hans Ecke 2012-01-11 19:57:20 UTC
Created attachment 552214 [details]
output from "systemctl dump"

Comment 2 Hans Ecke 2012-01-11 19:58:23 UTC
Created attachment 552215 [details]
output from "/bin/systemd --test --system --log-level=debug 2>&1"

This was done as user "hans". systemd refuses to run this as root.

Comment 3 Hans Ecke 2012-01-11 19:59:19 UTC
Created attachment 552216 [details]
a copy of /var/log/messages

... I deleted everything before the bootup

Comment 4 Hans Ecke 2012-01-11 20:08:06 UTC
A few comments:

1.) I changed the /etc/init.d/network script with these lines:

echo "Running /etc/init.d/network (stderr)" >&2
logger "Running /etc/init.d/network (syslog)"
echo "Running /etc/init.d/network (kmsg)" > /dev/kmsg

at the very top. You can see the output in dmesg.txt at 95.657956. Obviously the network should have been brought up much earlier....

2.) systemd unit files are unchanged. /etc/init.d/network is unchanged except for the above 3 lines.

3.) root@bose:/mpl/bose 518> rpm -qa|fgrep NetworkM

NetworkManager-gtk-0.9.2-1.fc16.x86_64
NetworkManager-glib-0.9.2-1.fc16.x86_64

4.) root@bose:/mpl/bose 519> systemctl is-enabled network.service

network.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig network --level=5
enabled

5.) root@bose:/mpl/bose 521> systemctl is-enabled remote-fs.target
enabled

Comment 5 Michal Schmidt 2012-01-11 23:18:32 UTC
There's a timeout waiting for a non-sensical device unit:
[   93.780946] systemd[1]: Job pavlov:-data.device/start timed out.

Do you have a non-zero value in the sixth fstab field (fs_passno) in /etc/fstab for any of the NFS mounts? I think that's what confuses systemd here.

Comment 6 Hans Ecke 2012-01-12 00:05:38 UTC
That was indeed the case, and now it mounts correctly. Thank you Michal!

However, I still get a huge wait/timeout. I'll attach a new dmesg file so you can see it. Any thoughts?

Comment 7 Hans Ecke 2012-01-12 00:06:55 UTC
Created attachment 552261 [details]
dmesg after setting the fs_passno field of NFS mounts to 0

Comment 8 Michal Schmidt 2012-01-12 01:10:59 UTC
From the log:

[   24.887032] systemd[1]: About to execute: /etc/rc.d/init.d/network start
[   24.911899] systemd[1]: Forked /etc/rc.d/init.d/network as 971
[   24.911941] systemd[1]: network.service changed dead -> start
[   24.922464] logger[983]: Running /etc/init.d/network (syslog)
[   24.922569] Running /etc/init.d/network (kmsg)

OK, the network script has been spawned.

[   27.470124] systemd[1]: Child 971 belongs to network.service
[   27.470137] systemd[1]: network.service: control process exited, code=exited status=0

The network script finished.

[   27.470142] systemd[1]: network.service got final SIGCHLD for state start
[   27.470193] systemd[1]: network.service changed start -> running
[   27.470206] systemd[1]: Job network.service/start finished, result=done
[   27.491866] systemd[1]: network.target changed dead -> active
[   27.491873] systemd[1]: Job network.target/start finished, result=done

Mounting starts at this point.

[   29.360845] systemd[1]: Received SIGCHLD from PID 1439 (ifup-eth).
[   29.360878] systemd[1]: Got SIGCHLD for process 1439 (ifup-eth)
[   29.385851] systemd[1]: Received SIGCHLD from PID 1444 (ifup-eth).
[   29.385882] systemd[1]: Got SIGCHLD for process 1444 (ifup-eth)
[   29.387120] systemd[1]: network.service: cgroup is empty

It's odd that the two ifup-eth processes outlived the network script and finished only now. I don't know where this asynchronicity comes from.

Comment 9 Michal Sekletar 2012-08-03 12:44:35 UTC
Are you experiencing the same issue, with latest versions of systemd, initscripts and bash? I consulted this issue with maintainer of initscripts and we didn't figure out how is it possible for /etc/rc.d/init.d/network to exit earlier than ifup-eth script.

Comment 10 Hans Ecke 2012-08-15 08:56:43 UTC
Hi Michal-

I do not see this bug anymore with current package versions. Feel free to close issue and thank you for your work.

Hans


Note You need to log in before you can comment on or make changes to this bug.