Bug 700975

Summary: Eth dev assignment not consistent across boots when using bonded network configurations.
Product: Red Hat Enterprise Linux 6 Reporter: pat ray <patray>
Component: udev Assignee: Harald Hoyer <harald>
Status: CLOSED INSUFFICIENT_DATA QA Contact: qe-baseos-daemons
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: ddumas, msekleta, rvokal
Target Milestone: rc Flags: pknirsch: needinfo? (patray)
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-19 15:18:55 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description pat ray 2011-04-30 05:25:02 UTC
User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16

I'm running RHEL6 on a Dell server with one 4-port on-board NIC and an add-on 4-port NIC. On initial boot and subsequent boots with normal (unbonded) network configurations, the ethX to MAC assignments stick - the 70-persistent-net.rules don't change, and ip addr show returns the same results after every boot.

At this point, my 70-persistent-net.rules look like this:

# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:ad", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:af", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:b1", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:b3", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:28", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:29", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:2c", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:2d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth7"


All of the ifcfg-ethX files look like this:

DEVICE="eth0"
BOOTPROTO="static"
DNS1="172.30.0.202"
GATEWAY="172.30.28.1"
HWADDR="00:24:E8:68:C0:AD"
IPADDR="172.30.31.223"
NETMASK="255.255.252.0"
NM_CONTROLLED="yes"
ONBOOT="yes"

varying, of course, by DEVICE and HWADDR and the various network addresses and netmask.

I configure the devices for bonded networking (balance-alb, if that matters), on two separate bonds - eth0,1,4,5 on bond1 and 2,3,6,7 on bond0. Here are the ifcfg-ethX and ifcfg-bondX files.

::::::::::::::
ifcfg-eth0
::::::::::::::
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
HWADDR=00:24:e8:68:c0:ad
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth1
::::::::::::::
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
HWADDR=00:24:e8:68:c0:af
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth2
::::::::::::::
DEVICE=eth2
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
HWADDR=00:24:e8:68:c0:b1
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth3
::::::::::::::
DEVICE=eth3
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
HWADDR=00:24:e8:68:c0:b3
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth4
::::::::::::::
DEVICE=eth4
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
HWADDR=00:1b:21:3b:6e:28
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth5
::::::::::::::
DEVICE=eth5
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
HWADDR=00:1b:21:3b:6e:29
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth6
::::::::::::::
DEVICE=eth6
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
HWADDR=00:1b:21:3b:6e:2c
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-eth7
::::::::::::::
DEVICE=eth7
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
HWADDR=00:1b:21:3b:6e:2d
MTU=1500
NM_CONTROLLED=no

::::::::::::::
ifcfg-bond0
::::::::::::::
# Private bonded interface
DEVICE=bond0
IPADDR=192.168.0.5
NETMASK=255.255.0.0
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MTU=1500
BONDING_OPTS="mode=balance-alb miimon=100 updelay=500"
NM_CONTROLLED=no


::::::::::::::
ifcfg-bond1
::::::::::::::
# Public bonded interface
DEVICE=bond1
IPADDR=172.30.31.223
NETMASK=255.255.252.0
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MTU=1500
BONDING_OPTS="mode=balance-alb miimon=100 updelay=500"
NM_CONTROLLED=no


GATEWAY=172.30.28.1

Once I reboot, the device assignments are scrambled.

70-persistent-net.rules is rewritten, rarely in a way that leaves a running network. A typical result after reboot is:

[root@csn3 ~]# cat /etc/udev/rules.d/70-persistent-net.rules
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:ad", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:b1", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:af", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:2c", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:2d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth7"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:b3", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:29", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:28", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:ad", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:28", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x10d6 (igb) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:3b:6e:2c", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x14e4:0x1639 (bnx2) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:24:e8:68:c0:b1", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

Here's the ip addr show output:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
    link/ether 00:24:e8:68:c0:ad brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
    link/ether 00:1b:21:3b:6e:29 brd ff:ff:ff:ff:ff:ff
4: eth6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:24:e8:68:c0:b1 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:24:e8:68:c0:b3 brd ff:ff:ff:ff:ff:ff
6: eth0-eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:3b:6e:28 brd ff:ff:ff:ff:ff:ff
7: eth5: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
    link/ether 00:24:e8:68:c0:af brd ff:ff:ff:ff:ff:ff
8: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:1b:21:3b:6e:2c brd ff:ff:ff:ff:ff:ff
9: eth7: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:1b:21:3b:6e:2d brd ff:ff:ff:ff:ff:ff
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:1b:21:3b:6e:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.5/16 brd 192.168.255.255 scope global bond0
    inet 192.168.0.3/16 brd 192.168.255.255 scope global secondary bond0:1
    inet 192.168.0.6/16 brd 192.168.255.255 scope global secondary bond0:2
    inet 192.168.0.7/16 brd 192.168.255.255 scope global secondary bond0:3
    inet6 fe80::21b:21ff:fe3b:6e2c/64 scope link
       valid_lft forever preferred_lft forever
11: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:24:e8:68:c0:af brd ff:ff:ff:ff:ff:ff
    inet 172.30.31.223/22 brd 172.30.31.255 scope global bond1
    inet 172.30.31.232/22 brd 172.30.31.255 scope global secondary bond1:1
    inet6 fe80::224:e8ff:fe68:c0af/64 scope link
       valid_lft forever preferred_lft forever

The MAC on eth0 and eth4 conflict in this case, leaving one of the links on bond1 down.
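
A quick way to spot this kind of conflict is to look for names that the rules file binds to more than one MAC address. The following is only a sketch: duplicate_names is a hypothetical helper, not part of udev, and it assumes every rule sits on a single line, as the file's header comment requires.

```shell
# Hypothetical helper: list interface names that a udev rules file assigns
# to more than one MAC address. Assumes one rule per line.
duplicate_names() {
    # Extract "NAME MAC" pairs, drop exact repeats, then count MACs per name.
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\2 \1/p' "$1" |
        sort -u |
        awk '{ count[$1]++ } END { for (n in count) if (count[n] > 1) print n }' |
        sort
}

# Example use (path as given in this report):
#   duplicate_names /etc/udev/rules.d/70-persistent-net.rules
```

In the rewritten rules listed above, eth0, eth2, eth4 and eth6 are each bound to two different MACs, so those are the names such a check should report.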

I'm pretty sure the problem is in /lib/udev/write_net_rules. This function 

find_all_ifcfg() {
    local links=$1
    local __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'

    files=$(echo /etc/sysconfig/network-scripts/ifcfg-* \
        | LC_ALL=C sed -e "$__sed_discard_ignored_files")
    for i in $files; do
        (
            . $i
            [ -n "$HWADDR" ] && [ "${links%%[ \[\]0-9]*}" = "${DEVICE%%[ \[\]0-9]*}" ] && echo $DEVICE
        )
    done
}


stops enumerating ifcfg-* files when it finds an ifcfg-* file with a DEVICE that begins with something other than 'eth', which is the case for my ifcfg-bondX files. (It's also the case for files that don't have a HWADDR, which is true for ifcfg-bondX files.)
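
For readers unfamiliar with the parameter expansion in that test, here is a small sketch of what it computes. The helper names name_prefix and same_family are hypothetical and exist only for illustration. The pattern itself is copied from the function above.

```shell
# Hypothetical helper: reduce an interface name to its prefix the way the
# test in find_all_ifcfg() does, by deleting everything from the first
# space, bracket, or digit onward.
name_prefix() {
    printf '%s\n' "${1%%[ \[\]0-9]*}"
}

# Hypothetical helper: succeed when two names share the same prefix.
same_family() {
    [ "$(name_prefix "$1")" = "$(name_prefix "$2")" ]
}

name_prefix eth0     # prints eth
name_prefix eth12    # prints eth
name_prefix bond1    # prints bond

if same_family eth3 bond0; then
    echo "eth3 and bond0 match"
else
    echo "eth3 and bond0 do not match"   # this branch runs
fi
```

So an eth* link compared against a DEVICE of bond0 or bond1 fails the prefix comparison, which is the situation described for the ifcfg-bondX files.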

I made a simple patch to write_net_rules, replacing find_all_ifcfg() with this code

find_all_ifcfg() {
    local links=$1
    local __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'

    files=$(echo /etc/sysconfig/network-scripts/ifcfg-* \
        | LC_ALL=C sed -e "$__sed_discard_ignored_files")
    for i in $files; do
        (
            # We probably need to clean out $HWADDR here so that we don't get
            # any accidental overwrites
            . $i
            if [ -n "$HWADDR" ] && [ "${links%%[ \[\]0-9]*}" = "${DEVICE%%[ \[\]0-9]*}" ]; then
                echo $DEVICE
            fi
        )
    done
}

and it now works like a champ. My 70-persistent-net.rules stick, and my bonds work.
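
One possible explanation for the difference, offered as an assumption and not a confirmed diagnosis: the two loop bodies leave the subshell with different exit statuses when the test fails. If write_net_rules runs with errexit enabled (an assumption here, worth checking against the script's first line), a non-zero subshell status could end the enumeration early. The function names below are hypothetical stand-ins for the two forms.

```shell
# Hypothetical stand-ins for the two loop bodies. HWADDR is left empty,
# as it would be after sourcing an ifcfg-bondX file.
and_list_form() {
    ( HWADDR=''; [ -n "$HWADDR" ] && echo match )
}
if_form() {
    ( HWADDR=''; if [ -n "$HWADDR" ]; then echo match; fi )
}

and_list_form && rc=0 || rc=$?
echo "and-list form: exit status $rc"   # exit status 1
if_form && rc=0 || rc=$?
echo "if form: exit status $rc"         # exit status 0
```

An and-list returns the status of the last command it ran, so a failed test makes the subshell return non-zero. An if statement whose condition is false and which has no else branch returns zero.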

Reproducible: Always

Steps to Reproduce:
See above:
1) Boot
2) Set up bonding
3) Reboot

Comment 2 RHEL Program Management 2011-04-30 06:00:43 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Harald Hoyer 2011-05-02 11:12:18 UTC
Hmm, normally /lib/udev/rules.d/60-net.rules kicks in and renames your network interfaces, if it finds correct ifcfg files with HWADDR.

70-persistent-net.rules is only written for _unconfigured_ network interfaces.
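
To see which interfaces count as configured in this sense, it may help to list the DEVICE and HWADDR pairs that the ifcfg files declare. This is a rough sketch only: list_hwaddr_bindings is a hypothetical helper, not something shipped with udev or initscripts, and it sources each file, so it should only be pointed at trusted configuration.

```shell
# Hypothetical helper: print "DEVICE HWADDR" for every ifcfg file in the
# given directory that sets HWADDR. Files without HWADDR are skipped.
list_hwaddr_bindings() {
    for f in "$1"/ifcfg-*; do
        [ -f "$f" ] || continue
        (
            # Start from empty values so nothing leaks in from the caller.
            DEVICE='' HWADDR=''
            . "$f"
            if [ -n "$HWADDR" ]; then
                # Lower-case the MAC so upper- and lower-case spellings compare equal.
                mac=$(printf '%s' "$HWADDR" | tr 'A-F' 'a-f')
                printf '%s %s\n' "$DEVICE" "$mac"
            fi
        )
    done
}

# Example use:
#   list_hwaddr_bindings /etc/sysconfig/network-scripts
```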

What is the output of:

# for i in $(seq 0 9); do echo "eth${i}:"; INTERFACE=eth$i /lib/udev/rename_device;done

If in doubt please just remove 70-persistent-net.rules.

I don't think your patch changes any behaviour of find_all_ifcfg().

Comment 7 Denise Dumas 2012-01-05 16:14:03 UTC
Per Harald, changing this to cond nak design - we still needinfo from the reporter and have no clear diagnosis.

Comment 8 RHEL Program Management 2012-05-03 04:41:07 UTC
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 9 Harald Hoyer 2015-10-13 09:45:08 UTC
any updates on this?

Comment 10 Michal Sekletar 2015-11-19 15:18:55 UTC
I tried to reproduce this bug. I set up two bonds as described here, each over 4 Ethernet interfaces, using kernel names. I also configured the system so that the original eth0 should be named eth1, eth2 should be named eth3, etc. After configuration I saved the currently generated version of 70-persistent-net.rules and rebooted the system 20 times. Each time I checked whether my saved copy of the rules was the same as the one in /etc/udev/rules.d. It was every time, and the bond interfaces worked every time, i.e. I used one of them to ssh in.

Since there is no additional info provided by the reporter and I couldn't reproduce the bug myself, moving this to closed.
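
The per-reboot comparison described here could be scripted along these lines. This is a sketch, not the commands actually used in the test. rules_unchanged is a hypothetical helper and the saved-copy path in the example is an assumption.

```shell
# Hypothetical helper: succeed when the live rules file is byte-for-byte
# identical to a saved reference copy.
rules_unchanged() {
    cmp -s "$1" "$2"
}

# Example use after each reboot (the saved-copy path is an assumption):
#   rules_unchanged /root/70-persistent-net.rules.saved \
#       /etc/udev/rules.d/70-persistent-net.rules || echo "rules changed"
```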