Bug 690292 - /etc/init.d/network script is not started, so many other services fail to start
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Lennart Poettering
QA Contact: Fedora Extras Quality Assurance
Blocks: F15Blocker, F15FinalBlocker 693504
Reported: 2011-03-23 19:47 UTC by Andrew McNabb
Modified: 2011-05-28 00:59 UTC

Doc Type: Bug Fix
Clones: 693504
Last Closed: 2011-04-06 18:01:39 UTC
Type: ---


Attachments
/var/log/messages (825.00 KB, application/octet-stream)
2011-03-23 19:47 UTC, Andrew McNabb
dmesg.txt (45.13 KB, text/plain)
2011-03-23 19:47 UTC, Andrew McNabb
systemd-dump.txt (459.10 KB, text/plain)
2011-03-23 19:48 UTC, Andrew McNabb
systemd-test.txt (29 bytes, text/plain)
2011-03-23 19:48 UTC, Andrew McNabb
boot.log with network.service enabled (2.41 KB, text/plain)
2011-03-25 18:38 UTC, Andrew McNabb

Description Andrew McNabb 2011-03-23 19:47:27 UTC
Created attachment 487129
/var/log/messages

Description of problem:

I'm running Fedora 15 Alpha, and when I boot, many services fail to start.  I've noticed that /tmp and /var/tmp are mounted read-only, which is likely the reason that the other services are failing.  From /proc/mounts:

/dev/vda2 /tmp ext4 ro,relatime,barrier=1,data=ordered 0 0
/dev/vda2 /var/tmp ext4 ro,relatime,barrier=1,data=ordered 0 0
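
For anyone checking for the same symptom, here is a minimal Python sketch that lists the read-only mount points in /proc/mounts-style text. The function name read_only_mounts is made up for illustration, and the sample input is just the two lines quoted above.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains 'ro'.

    Each /proc/mounts line has six whitespace-separated fields:
    device, mount point, filesystem type, options, dump, pass.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


# The two lines quoted in this report.
SAMPLE = """\
/dev/vda2 /tmp ext4 ro,relatime,barrier=1,data=ordered 0 0
/dev/vda2 /var/tmp ext4 ro,relatime,barrier=1,data=ordered 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mounts instead of SAMPLE.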


Version-Release number of selected component (if applicable):

systemd-20-1.fc15.x86_64

How reproducible:

I've tried rebooting 2 or 3 times, and it seems to happen every time.

Steps to Reproduce:
1. Boot
  
Actual results:

Services fail to start, and when I am finally able to log in (on the console), /tmp is read-only.

Additional info:

/proc/cmdline: ro root=UUID=4910e6a5-77b5-4fec-b2d1-5448cebd8e25 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet

I will attach the files described at http://fedoraproject.org/wiki/How_to_debug_Systemd_problems.

Comment 1 Andrew McNabb 2011-03-23 19:47:52 UTC
Created attachment 487130
dmesg.txt

Comment 2 Andrew McNabb 2011-03-23 19:48:17 UTC
Created attachment 487131
systemd-dump.txt

Comment 3 Andrew McNabb 2011-03-23 19:48:40 UTC
Created attachment 487132
systemd-test.txt

Comment 4 Michal Schmidt 2011-03-23 20:40:47 UTC
There are some bad ordering cycles that cause important units to be removed:

[    5.052814] systemd[1]: Found ordering cycle on basic.target/start
[    5.052833] systemd[1]: Walked on cycle path to udev-retry.service/start
[    5.052849] systemd[1]: Walked on cycle path to rpcbind.service/start
[    5.052863] systemd[1]: Walked on cycle path to lvm2-monitor.service/start
[    5.052876] systemd[1]: Walked on cycle path to basic.target/start
[    5.052890] systemd[1]: Breaking ordering cycle by deleting job udev-retry.service/start
[    5.053076] systemd[1]: Found ordering cycle on basic.target/start
[    5.053093] systemd[1]: Walked on cycle path to sockets.target/start
[    5.053107] systemd[1]: Walked on cycle path to dbus.socket/start
[    5.053121] systemd[1]: Walked on cycle path to sysinit.target/start
[    5.053134] systemd[1]: Walked on cycle path to local-fs.target/start
[    5.053148] systemd[1]: Walked on cycle path to quotacheck.service/start
[    5.053161] systemd[1]: Walked on cycle path to aml.mount/start
[    5.053174] systemd[1]: Walked on cycle path to network.target/start
[    5.053187] systemd[1]: Walked on cycle path to network.service/start
[    5.053200] systemd[1]: Walked on cycle path to sandbox.service/start
[    5.053213] systemd[1]: Walked on cycle path to basic.target/start
[    5.053227] systemd[1]: Breaking ordering cycle by deleting job dbus.socket/start
[    5.053324] systemd[1]: Found ordering cycle on basic.target/start
[    5.053338] systemd[1]: Walked on cycle path to sysinit.target/start
[    5.053351] systemd[1]: Walked on cycle path to local-fs.target/start
[    5.053364] systemd[1]: Walked on cycle path to quotacheck.service/start
[    5.053378] systemd[1]: Walked on cycle path to aml.mount/start
[    5.053391] systemd[1]: Walked on cycle path to network.target/start
[    5.053404] systemd[1]: Walked on cycle path to network.service/start
[    5.053417] systemd[1]: Walked on cycle path to sandbox.service/start
[    5.053461] systemd[1]: Walked on cycle path to basic.target/start
[    5.053475] systemd[1]: Breaking ordering cycle by deleting job local-fs.target/start
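
A rough Python sketch, assuming only the three message formats visible in the log above, that groups such lines into cycles and reports which job was deleted to break each one. The names parse_cycles, FOUND, WALKED and BROKEN are invented for this example; the sample is the first cycle from the log.

```python
import re

# Patterns for the three systemd messages quoted above.
FOUND = re.compile(r"Found ordering cycle on (\S+)")
WALKED = re.compile(r"Walked on cycle path to (\S+)")
BROKEN = re.compile(r"Breaking ordering cycle by deleting job (\S+)")


def parse_cycles(log_text):
    """Return a list of (cycle_path, deleted_job) tuples."""
    cycles = []
    path = []
    for line in log_text.splitlines():
        match = FOUND.search(line)
        if match:
            path = [match.group(1)]  # a new cycle starts here
            continue
        match = WALKED.search(line)
        if match:
            path.append(match.group(1))
            continue
        match = BROKEN.search(line)
        if match:
            cycles.append((path, match.group(1)))
            path = []
    return cycles


# First cycle from the dmesg output in this comment.
SAMPLE = """\
[    5.052814] systemd[1]: Found ordering cycle on basic.target/start
[    5.052833] systemd[1]: Walked on cycle path to udev-retry.service/start
[    5.052849] systemd[1]: Walked on cycle path to rpcbind.service/start
[    5.052863] systemd[1]: Walked on cycle path to lvm2-monitor.service/start
[    5.052876] systemd[1]: Walked on cycle path to basic.target/start
[    5.052890] systemd[1]: Breaking ordering cycle by deleting job udev-retry.service/start
"""

if __name__ == "__main__":
    for cycle_path, deleted in parse_cycles(SAMPLE):
        print(" -> ".join(cycle_path), "| deleted:", deleted)
```

Run against the full log, this makes it easy to see that dbus.socket and local-fs.target were among the deleted jobs, which fits the read-only /tmp and the failed services.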

Comment 5 Andrew McNabb 2011-03-23 21:04:47 UTC
That's really interesting.  It may help to mention that I disabled NetworkManager.service because my configuration needs to use network.service (since NetworkManager doesn't support bridges).  Perhaps part of the problem is that network.service is less well tested and has a problem with its dependencies (?).

Comment 6 Andrew McNabb 2011-03-24 18:12:31 UTC
I just reinstalled my test machine with Fedora 15 Alpha. This time I had it pull in the development repository for Fedora 15 in case anything has been updated very recently. Unfortunately, there does not appear to have been any change. Is there any other information that I can provide that would be helpful? Thanks.

Comment 7 Jóhann B. Guðmundsson 2011-03-25 16:31:58 UTC
Could you explain the partition layout you are using?

I just did a next next next install and I'm unable to duplicate this report.

Comment 8 Jóhann B. Guðmundsson 2011-03-25 16:34:18 UTC
Default next next next install:

Kernel command line: ro root=/dev/mapper/VolGroup-lv_root rd_LVM_LV=VolGroup/lv_root rd_LVM_LV=VolGroup/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=is-latin1 rhgb quiet

Yours:

Kernel command line: ro root=UUID=4910e6a5-77b5-4fec-b2d1-5448cebd8e25 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet

Comment 9 Andrew McNabb 2011-03-25 16:49:11 UTC
Jóhann, I'm using the following lines in a kickstart script:

part /boot --fstype ext3 --ondisk=vda --asprimary
part / --size=16000 --fstype ext4 --ondisk=vda --asprimary
part swap --size=2048 --fstype swap --ondisk=vda --asprimary
part /local --size=100 --grow --fstype ext4 --ondisk=vda --asprimary --label=local

However, I'm guessing that the problem is more related to the dependency cycle that Michal noticed rather than the partitioning layout. In fact, I've found that I can add "systemd.unit=rescue.target" to the kernel command line, and then hit CTRL-D when the rescue prompt appears, and then the system boots normally: networking works, gdm starts, etc. It seems that when systemd starts rescue mode first, it's able to figure out the dependencies correctly.

Comment 10 Andrew McNabb 2011-03-25 16:52:41 UTC
Jóhann, when you tried to reproduce, did you run "systemctl disable NetworkManager.service"?  I think this might be critical to reproducing.

Comment 11 Jóhann B. Guðmundsson 2011-03-25 17:03:55 UTC
I only checked whether /var/tmp was the cause (as the subject indicated).

If you disable NetworkManager, you effectively break everything that uses network.target, so you have to create a network.service and add WantedBy=network.target to it.

Comment 12 Andrew McNabb 2011-03-25 17:30:39 UTC
There is already a network.service shipped by default with Fedora, and this service is in fact critical for any configuration that NetworkManager still doesn't support.  In my case, our network configuration involves a bridge, but NetworkManager still doesn't support bridging, so switching from NetworkManager.service to network.service is the only way to get a working system.

Comment 13 Jóhann B. Guðmundsson 2011-03-25 17:44:21 UTC
Please do not confuse /etc/init.d/network with /lib/systemd/system/network.service <-- which does not exist unless you created it. 

You will need to create a network.service that gets pulled in by network.target as a replacement for NetworkManager if you don't want to break everything that uses network.target.

Comment 14 Bill Nottingham 2011-03-25 17:54:47 UTC
(In reply to comment #13)
> Please do not confuse /etc/init.d/network with
> /lib/systemd/system/network.service <-- which does not exist unless you created
> it. 
> 
> You will need to create a network.service that gets pulled in by the
> network.target as a replacement for NetworkManager if you dont want to break
> everything that uses network.target.

You shouldn't... the network SysV service provides $network.

Example:

$ systemctl show network.target
Id=network.target
Names=network.target
Wants=NetworkManager.service
...
After=network.service
...

Comment 15 Andrew McNabb 2011-03-25 17:55:10 UTC
Jóhann, thanks for the clarification.  I'm still learning about systemd, which is particularly hard to do while Fedora 15 is still in alpha.  I appreciate your help.

Shouldn't /lib/systemd/system/network.service be provided with systemd?  Users upgrading to Fedora 15 should be able to reasonably expect that "chkconfig NetworkManager off; chkconfig network on" (or "systemctl disable NetworkManager.service; systemctl enable network.service") will work and do the right thing.  Supporting /etc/init.d/network out of the box seems critical until some point in the future when NetworkManager supports all of its features.

Comment 16 Jóhann B. Guðmundsson 2011-03-25 18:14:13 UTC
> You shouldn't... the network SysV service provides $network.
> 
> Example:
> 
> $ systemctl show network.target
> Id=network.target
> Names=network.target
> Wants=NetworkManager.service <-- does this not have to be wants network.service? 
> ...
> After=network.service  
> ...

Hmm, this does not compute for me.

How is providing $network enough? I would think that you need something like the following to make it usable with other targets such as multi-user.target:

[Unit]
Description=Network
After=syslog.target
Conflicts=NetworkManager.service

[Service]
# quick hack since we have not converted the service yet.. 
ExecStart=/etc/init.d/network start

[Install]
WantedBy=network.target multi-user.target
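
A small Python sketch, assuming the unit text above, that reads the WantedBy= targets from the [Install] section. These are the targets whose .wants directories "systemctl enable" would link the unit into. The helper name wanted_by is hypothetical, and configparser is only an approximation of the unit file syntax: real unit files may repeat a key, which configparser rejects by default.

```python
import configparser

# The unit file proposed in this comment, copied verbatim.
UNIT_TEXT = """\
[Unit]
Description=Network
After=syslog.target
Conflicts=NetworkManager.service

[Service]
# quick hack since we have not converted the service yet..
ExecStart=/etc/init.d/network start

[Install]
WantedBy=network.target multi-user.target
"""


def wanted_by(unit_text):
    """Return the targets listed under [Install] WantedBy=."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case (WantedBy, not wantedby)
    parser.read_string(unit_text)
    return parser.get("Install", "WantedBy", fallback="").split()


if __name__ == "__main__":
    print(wanted_by(UNIT_TEXT))
```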

Comment 17 Andrew McNabb 2011-03-25 18:38:06 UTC
Created attachment 487640
boot.log with network.service enabled

Jóhann, I added the example you provided to "/lib/systemd/system/network.service" and enabled it, and it mostly worked.  The only problem I have noticed so far is that it failed to mount an NFS volume:

[root@testvm ~]# systemctl status aml.mount
aml.mount - /aml
	  Loaded: loaded
	  Active: failed since Fri, 25 Mar 2011 12:28:16 -0600; 46s ago
	   Where: /aml
	    What: nfs:/
	 Process: 364 ExecMount=/bin/mount /aml (code=exited, status=32)
	  CGroup: name=systemd:/system/aml.mount

From boot.log, it looks like it did "Starting /aml" pretty early in the boot sequence.  Oddly, "Starting Network" didn't even appear in boot.log, even though it seems to have happened.  I'll attach boot.log to give an example.

Anyway, once the kinks are worked out, it seems like this bug would be fixed by including this network.service file in systemd.

Comment 18 Jóhann B. Guðmundsson 2011-03-25 19:12:53 UTC
Most likely it's the NFS section in relation to fedora-readonly.service that's complaining there. Try adding network.target to the After= line in that service.

You should have received the same complaint at bootup if you had used NetworkManager as well.

Note that I just threw in that network.service file off the top of my head as a response to Bill's comment, so you get to keep all the ponies and glory that come with it ;)

Comment 19 Andrew McNabb 2011-03-25 20:29:25 UTC
Jóhann, are you sure that fedora-readonly is the right service?  After hunting around a little bit, I'm thinking that maybe there needs to be a "Before=remote-fs.target" line added to "network.target" or something like that.  Any thoughts about what I should try?  I'm a little too new to this to completely understand how the different units interact.

Should the network.service file be added to the systemd-units package, or does it belong in the initscripts package?  Is there anyone here who can make the addition?

Comment 20 Jóhann B. Guðmundsson 2011-03-25 20:43:44 UTC
(In reply to comment #19)
> Jóhann, are you sure that fedora-readonly is the right service?  After hunting
> around a little bit, I'm thinking that maybe there needs to be a
> "Before=remote-fs.target" line added to "network.target" or something like
> that.  Any thoughts about what I should try?  I'm a little too new to this to
> completely understand how the different units interact.

remote-fs.target already has

Requires=network.target <---
After=network.target <--- ( started after the network.target )

> Should the network.service file be added to the systemd-units package, or does
> it belong in the initscripts package?  Is there anyone here who can make the
> addition?

Lennart (who is away, btw) or Bill, but they probably want to check with Dan first to see if NM 0.9 (which should hit F15 soon) finally supports bridge networking before doing that.

Comment 21 Bill Nottingham 2011-03-25 20:49:23 UTC
(In reply to comment #16)
> > You shouldn't... the network SysV service provides $network.
> > 
> > Example:
> > 
> > $ systemctl show network.target
> > Id=network.target
> > Names=network.target
> > Wants=NetworkManager.service <-- does this not have to be wants network.service? 
> > ...
> > After=network.service  
> > ...
> 
> Hum not computing this..


You shouldn't need a network.service file, *period*. /etc/init.d/network *is* network.service, and is scheduled before network.target due to the fact that it provides $network.
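
To illustrate where "provides $network" comes from, here is a minimal Python sketch that pulls the Provides: line out of an LSB header in a SysV init script. The header in SAMPLE is a hypothetical example, not a copy of Fedora's /etc/init.d/network, and lsb_provides is an invented helper name. The ordering consequence (the script being scheduled before network.target) is taken from the statement above, not derived by this code.

```python
def lsb_provides(script_text):
    """Return the facilities on the '# Provides:' line of an LSB header."""
    inside = False
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped == "### BEGIN INIT INFO":
            inside = True
        elif stripped == "### END INIT INFO":
            break
        elif inside and stripped.startswith("# Provides:"):
            return stripped[len("# Provides:"):].split()
    return []


# Hypothetical header, for illustration only.
SAMPLE = """\
#!/bin/sh
### BEGIN INIT INFO
# Provides: $network
# Short-Description: Bring up/down networking
### END INIT INFO
"""

if __name__ == "__main__":
    print(lsb_provides(SAMPLE))
```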

Comment 22 Andrew McNabb 2011-03-25 20:59:45 UTC
(In reply to comment #20)
> (In reply to comment #19)
> 
> remote-fs.target already has
> 
> Requires=network.target <---
> After=network.target <--- ( started after the network.target )

Since systemd is trying to mount /aml before network.service is started, it looks like something isn't quite working right.


> Lennart ( which is away btw ) or Bill but they probably want to check with Dan
> first to see if NM 0.9 ( which should hit F15 soon ) finally supports bridge
> networking before doing that..

This would assume that 1) NetworkManager 0.9 hits before Fedora 15 beta, 2) there is no remaining functionality in /etc/init.d/network that is still missing in NM, and 3) no one experiences any NM bugs requiring them to go back to /etc/init.d/network.  If any of these assumptions is missed, then people will need network.service.  Additionally, people may have scripts (as I have) saying "chkconfig NetworkManager off; chkconfig network on", and since there is an /etc/init.d/network file, these commands will succeed, but the system will fail to boot properly, and there will be no indication of what went wrong.

It seems to me that as long as /etc/init.d/network is in Fedora, there should be a matching network.service file to make the system work.

Comment 23 Andrew McNabb 2011-03-25 21:00:52 UTC
(In reply to comment #21)
> 
> You shouldn't need a network.service file, *period*. /etc/init.d/network *is*
> network.service, and is scheduled before network.target due to the fact that it
> provides $network.

Fixing this would make me just as happy as having a network.service file. :)

Comment 24 Jóhann B. Guðmundsson 2011-03-25 21:10:20 UTC
(In reply to comment #21)
> (In reply to comment #16)
> > > You shouldn't... the network SysV service provides $network.
> > > 
> > > Example:
> > > 
> > > $ systemctl show network.target
> > > Id=network.target
> > > Names=network.target
> > > Wants=NetworkManager.service <-- does this not have to be wants network.service? 
> > > ...
> > > After=network.service  
> > > ...
> > 
> > Hum not computing this..
> 
> 
> You shouldn't need a network.service file, *period*. /etc/init.d/network *is*
> network.service, and is scheduled before network.target due to the fact that it
> provides $network.

I thought network.target was at the top, that any network service (NetworkManager or otherwise) belonged beneath it, and that such a service needed to contain WantedBy=network.target if it wanted to become the primary network provider.

I stand corrected...

Comment 25 Andrew McNabb 2011-04-01 19:34:26 UTC
I upgraded to systemd 22, and this problem is now worse. If I run "chkconfig NetworkManager off", it says "Forwarding request to 'systemctl disable NetworkManager.service'" but then doing "chkconfig --list NetworkManager" shows that it is still enabled. If I run "systemctl disable NetworkManager.service", then "chkconfig --list" still shows that it is enabled. I'm not quite sure what native command should show whether NetworkManager is enabled in systemd; I thought that "systemctl status NetworkManager.service" might do this, but "enabled" or "disabled" doesn't appear anywhere in the output. Anyway, it looks like there is no longer any way to disable NetworkManager.service.

Comment 26 Michal Schmidt 2011-04-02 12:57:07 UTC
(In reply to comment #25)
> I'm not quite sure what native command should show whether [NetworkManager]
> is enabled in systemd;

systemctl is-enabled NetworkManager.service && echo yes

In case some of the utilities are misbehaving, please show us the output of:
ls /etc/rc?.d/*Net*
find /etc/systemd/system -name '*Net*'

> Anyway, it looks like there is no longer any way to disable
> NetworkManager.service.

I think it is in fact disabled, but chkconfig cannot be trusted to show that.
Well, what did "systemctl status NetworkManager.service" say? Is NM active or not?

Comment 27 Andrew McNabb 2011-04-04 17:23:15 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > I'm not quite sure what native command should show whether [NetworkManager]
> > is enabled in systemd;
> 
> systemctl is-enabled NetworkManager.service && echo yes

It does not print anything, so apparently it's not enabled.


> In case some of the utilities are misbehaving, please show us the output of:
> ls /etc/rc?.d/*Net*
> find /etc/systemd/system -name '*Net*'

[root@testvm ~]# ls /etc/rc?.d/*Net*
/etc/rc0.d/K84NetworkManager  /etc/rc2.d/S23NetworkManager  /etc/rc4.d/S23NetworkManager  /etc/rc6.d/K84NetworkManager
/etc/rc1.d/K84NetworkManager  /etc/rc3.d/S23NetworkManager  /etc/rc5.d/S23NetworkManager
[root@testvm ~]# find /etc/systemd/system -name '*Net*'
[root@testvm ~]#


> I think it is in fact disabled, but chkconfig cannot be trusted to show that.
> Well, what did "systemctl status NetworkManager.service" say? Is NM active or
> not?

This command, unfortunately, does not include "enabled" or "disabled" anywhere in its output:

NetworkManager.service - Network Manager
	  Loaded: loaded (/lib/systemd/system/NetworkManager.service)
	  Active: inactive (dead)
	  CGroup: name=systemd:/system/NetworkManager.service

Comment 28 Michal Schmidt 2011-04-04 19:55:10 UTC
(In reply to comment #27)
> > systemctl is-enabled NetworkManager.service && echo yes
> 
> It does not print anything, so apparently it's not enabled.

Yes, the native NetworkManager.service is disabled on your system.

> > In case some of the utilities are misbehaving, please show us the output of:
> > ls /etc/rc?.d/*Net*
> > find /etc/systemd/system -name '*Net*'
> 
> [root@testvm ~]# ls /etc/rc?.d/*Net*
> /etc/rc0.d/K84NetworkManager  /etc/rc2.d/S23NetworkManager 
> /etc/rc4.d/S23NetworkManager  /etc/rc6.d/K84NetworkManager
> /etc/rc1.d/K84NetworkManager  /etc/rc3.d/S23NetworkManager 
> /etc/rc5.d/S23NetworkManager

This is why chkconfig reports NM as enabled in runlevels 2, 3, 4, 5.
But since the native unit trumps the legacy script, the legacy setting has no effect. Didn't chkconfig print a 3-line warning stressing this point?
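
A minimal Python sketch of how the legacy view is derived from those symlink names: an S link in /etc/rcN.d means "start in runlevel N", a K link means "stop". The function name enabled_runlevels is made up for this example, and the path list is the ls output quoted above.

```python
import re

# /etc/rc<runlevel>.d/<S|K><priority><service>
RC_LINK = re.compile(r"/etc/rc(\d)\.d/([SK])(\d+)(.+)$")


def enabled_runlevels(paths, service):
    """Return the sorted runlevels in which `service` has a start (S) link."""
    levels = []
    for path in paths:
        match = RC_LINK.search(path)
        if not match:
            continue
        runlevel, kind, _priority, name = match.groups()
        if name == service and kind == "S":
            levels.append(int(runlevel))
    return sorted(levels)


# The symlinks listed in the quoted ls output.
PATHS = [
    "/etc/rc0.d/K84NetworkManager",
    "/etc/rc1.d/K84NetworkManager",
    "/etc/rc2.d/S23NetworkManager",
    "/etc/rc3.d/S23NetworkManager",
    "/etc/rc4.d/S23NetworkManager",
    "/etc/rc5.d/S23NetworkManager",
    "/etc/rc6.d/K84NetworkManager",
]

if __name__ == "__main__":
    print(enabled_runlevels(PATHS, "NetworkManager"))
```

This only mirrors what the legacy tooling sees; it says nothing about the native unit, which is what systemd actually acts on.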

> [root@testvm ~]# find /etc/systemd/system -name '*Net*'
> [root@testvm ~]#

This confirms that NetworkManager.service is not enabled, because it is not linked from any of the *.wants directories and the dbus-org.freedesktop.NetworkManager.service symlink is not present either, so NM cannot even be activated by D-Bus.

> > Well, what did "systemctl status NetworkManager.service" say? Is NM active
> > or not?
> 
> This command, unfortunately, does not include "enabled" or "disabled" anywhere
> in its output:
> 
> NetworkManager.service - Network Manager
>    Loaded: loaded (/lib/systemd/system/NetworkManager.service)
>    Active: inactive (dead)
>    CGroup: name=systemd:/system/NetworkManager.service

Right, but what I wanted to confirm with this is that the service is in fact inactive.

Anyway, we're way off topic. This bug was originally about services failing to start due to the filesystem staying read-only. It is my understanding this problem has been fixed, so I'm closing this bug. Please reopen if it still happens.

Comment 29 Andrew McNabb 2011-04-04 20:17:38 UTC
This bug (and I'm updating the title) is really about the /etc/init.d/network script not starting.  This bug is not fixed in any way, so the bug should not be closed.

Comment 30 Andrew McNabb 2011-04-04 20:23:46 UTC
The problem about the SysV init script for NetworkManager staying enabled is now cloned into bug #693504.

Comment 31 Michal Schmidt 2011-04-04 20:36:35 UTC
(In reply to comment #29)
> This bug (and I'm updating the title) is really about the /etc/init.d/network
> script not starting.  This bug is not fixed in any way, so the bug should not
> be closed.

Thanks for the clarification. In that case could you please attach the output of dmesg after booting with the current systemd and with the boot parameters "log_buf_len=1M systemd.log_level_debug systemd.log_target=kmsg"? I have reason to believe that it will be quite different from the older logs.

Comment 32 Michal Schmidt 2011-04-04 20:37:25 UTC
(In reply to comment #31)
> "log_buf_len=1M systemd.log_level_debug systemd.log_target=kmsg"?

Sorry, typo.

log_buf_len=1M systemd.log_level=debug systemd.log_target=kmsg
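
To make the difference concrete, a short Python sketch that splits a kernel command line into name/value pairs. With the mistyped form, no systemd.log_level key is present at all. The helper parse_cmdline is hypothetical and deliberately simple: it does not handle quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Parameters without '=' map to None. Only the first '=' separates
    name from value, so root=UUID=... keeps 'UUID=...' as the value.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


TYPO = "log_buf_len=1M systemd.log_level_debug systemd.log_target=kmsg"
GOOD = "log_buf_len=1M systemd.log_level=debug systemd.log_target=kmsg"

if __name__ == "__main__":
    print(parse_cmdline(TYPO).get("systemd.log_level"))
    print(parse_cmdline(GOOD).get("systemd.log_level"))
```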

Comment 33 Bill Nottingham 2011-04-04 20:45:02 UTC
(In reply to comment #29)
> This bug (and I'm updating the title) is really about the /etc/init.d/network
> script not starting.  This bug is not fixed in any way, so the bug should not
> be closed.

How does it not start, when it is properly enabled? (It's not enabled by default, because NM is.)

Comment 34 Andrew McNabb 2011-04-04 20:46:41 UTC
With layer upon layer of systemd bugs, it's getting hard for me to isolate them.  I'll try to get my system in a state where I can get you the information you are requesting.

Comment 35 Andrew McNabb 2011-04-04 20:49:24 UTC
(In reply to comment #33)
> (In reply to comment #29)
> > This bug (and I'm updating the title) is really about the /etc/init.d/network
> > script not starting.  This bug is not fixed in any way, so the bug should not
> > be closed.
> 
> How does it not start, when it is properly enabled? (It's not enabled by
> default, because NM is.)

Running "chkconfig NetworkManager off; chkconfig network on" does not work for several different reasons.  It's a bit challenging to keep the reports separate.
In any case, NetworkManager enabling itself by default via the SysV init scripts seems to be a relatively new phenomenon.

Comment 36 Andrew McNabb 2011-04-04 21:02:57 UTC
Okay, I've deleted /etc/rc?.d/*NetworkManager, and from the output of systemctl, I think that NetworkManager is no longer starting and that /etc/init.d/network is now starting.  What's the best way to tell for sure?  The one thing I've been able to find is the following in the output from systemctl:

network.service           loaded active running       LSB: Bring up/down networking

which I think is a good sign.  Unfortunately, I've had to make a lot of manual changes to this system, so I don't have much confidence in its current state.  Now that systemd 22 is in the updates repository, I'll try reinstalling the machine and see whether it works then.

Comment 37 Andrew McNabb 2011-04-06 16:49:05 UTC
I've reinstalled with a clean and up-to-date system, and I can confirm that the /etc/init.d/network script is started. It looks like this bug is fixed in systemd-22-1.fc15.x86_64. Thanks to those who looked into it.

Comment 38 Michal Schmidt 2011-04-06 18:01:39 UTC
OK, closing. If a similar problem reappears, please file a new bug. This one was getting a bit confusing.

