With NM (NetworkManager) disabled and the network service enabled, the following standard bridge configuration fails ifup every time:

[root@localhost 1105012]# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
ONBOOT=no
HOTPLUG=yes
TYPE=Bridge
BOOTPROTO=dhcp
STP=on
DELAY=0

[root@localhost 1105012]# cat /etc/sysconfig/network-scripts/ifcfg-enp2s0
DEVICE=enp2s0
HWADDR=00:11:22:33:44:55
ONBOOT=no
HOTPLUG=yes
BRIDGE=br0

# ifup enp2s0; ifup br0;

Determining IP information for br0... failed; no link present. Check cable?

Since this was originally reported with respect to libvirt's "virsh iface-start" command (which calls a function in the netcf library), I at first thought there might be a problem with the order in which netcf was ifup'ing the interfaces - in a discussion somewhere I'd seen someone mention that they were ifup'ing the bridge first, then the ethernets, which is the opposite of what netcf does. But manual experimentation shows that netcf is doing it in the correct order, and (as was suggested by someone triaging the original bug report) adding a sufficiently large LINKDELAY to ifcfg-br0 does solve the problem.

However, we should not require every existing installation with a bridge device and STP enabled to modify its config. Instead, initscripts' ifup should properly account for this needed delay when it notices that STP is enabled.

For the record, here is the sequence of events that leads to the problem:

1) "ifup $ether" calls /etc/sysconfig/network-scripts/ifup-eth, which does this:

   1a) auto-creates the $bridge *with an implicit 0 forward delay*, but leaves it "down".

   1b) "ip link set dev $ether up"

   1c) sleeps for $LINKDELAY seconds (as set in the ifcfg-$ether, NOT the ifcfg-$bridge)

   1d) brctl addif -- $bridge $ether (at this point, if you look at "brctl showstp $bridge", you'll see that the $ether port is in "disabled" state)

2) "ifup $bridge" - this again ends up in /etc/sysconfig/network-scripts/ifup-eth, which:

   2a) doesn't create the bridge device, because it was already auto-created in step (1a).

   2b) sets a forward delay and other bridge options according to ifcfg-$bridge

   2c) *IF* the device has "BOOTPROTO=dhcp", goes into a loop waiting for up to LINKDELAY seconds until /sys/class/net/$bridge/carrier contains "1" rather than "0". (NB: this will happen as soon as at least one device attached to the bridge is in "forwarding" state.)

Experimentation shows that when STP is enabled on the bridge, step 2c takes *at least* $DELAY * 2 + 5 seconds, and sometimes as much as $DELAY * 2 + 6.5 seconds. But when no LINKDELAY is set, check_link_down() only waits for 5 seconds, so it will *always* fail. (This happens regardless of how much time passes between the first and second ifup invocations. Also note that doing the ifups in the opposite order would also always fail, since carrier would *never* go up on the bridge device if it had nothing attached.)

Since I'm fairly certain that people have been configuring bridges with a non-0 DELAY for many years and haven't previously encountered this problem, I would class this as a regression in the behavior of ifup that must be resolved.
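To make the timing arithmetic above concrete, here is a minimal sketch of how a sufficient carrier timeout could be derived from the ifcfg values. The function name bridge_carrier_timeout is hypothetical and is not part of initscripts; the 7-second margin is my assumption (the observed worst case of 6.5 seconds rounded up to whole seconds), and DELAY and LINKDELAY are assumed to be whole numbers.

```shell
# Hypothetical helper, NOT the actual initscripts code.
# Prints how many seconds to wait for carrier on a bridge.
#   $1 = STP setting from ifcfg-$bridge ("on"/"yes" enables STP)
#   $2 = DELAY from ifcfg-$bridge (whole seconds, defaults to 0)
#   $3 = LINKDELAY (whole seconds; empty means the 5 second default)
bridge_carrier_timeout() {
    local stp=$1 delay=${2:-0} linkdelay=${3:-5}
    local timeout=$linkdelay
    local needed

    case "$stp" in
        on|yes)
            # Observed worst case was DELAY * 2 + 6.5 seconds;
            # rounding up to 7 is an assumption made for this sketch.
            needed=$(( delay * 2 + 7 ))
            if [ "$needed" -gt "$timeout" ]; then
                timeout=$needed
            fi
            ;;
    esac

    echo "$timeout"
}
```

For the configuration in this report (STP=on, DELAY=0, no LINKDELAY), "bridge_carrier_timeout on 0" prints 7, which is longer than the 5 seconds that check_link_down() waits by default. An explicitly larger LINKDELAY is left unchanged.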
Created attachment 956390
patch against current upstream git of initscripts

This patch causes ifup to wait at least as long as the STP-dependent time described above for carrier on a bridge device when STP is enabled. With it applied, all tests I've tried for differing values of STP, DELAY, and LINKDELAY succeed.

Note that although I filed this BZ against rawhide, the problem exists at least as far back as F20, as well as in RHEL7 and CentOS7 (I haven't checked RHEL6, but I think that it *isn't* a problem there), so the fix should be backported to all of those releases.
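For readers who want to see the shape of such a wait, below is a rough sketch of polling a carrier file with a timeout. It is not the code from the attached patch: the function name wait_for_carrier and the one-second polling interval are assumptions made for illustration.

```shell
# Hypothetical sketch, NOT the code from the attached patch.
# Poll a carrier file until it contains "1" or the timeout expires.
#   $1 = path to the carrier file (e.g. /sys/class/net/br0/carrier)
#   $2 = timeout in whole seconds
# Returns 0 if carrier came up in time, non-zero otherwise.
wait_for_carrier() {
    local carrier_file=$1 timeout=$2 waited=0

    while [ "$waited" -lt "$timeout" ]; do
        # The read can fail (for example while the device is down),
        # so errors are discarded and treated as "no carrier yet".
        if [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done

    # One final check after the last sleep.
    [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]
}
```

A caller would pass a timeout long enough to cover the STP forwarding delay, for example "wait_for_carrier /sys/class/net/br0/carrier 7" for a bridge with STP=on and DELAY=0.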
initscripts-9.56.1-4.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/initscripts-9.56.1-4.fc21
initscripts-9.51-3.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/initscripts-9.51-3.fc20
initscripts-9.56.1-4.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
initscripts-9.51-3.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.