Description of problem:
https://bugzilla.redhat.com/show_bug.cgi?id=1758091 added support for bonding interfaces. However, in recent builds it seems that a regression has occurred and the instructions that worked previously no longer have the same result.
Reproducible every time with recent 4.3 builds.
Steps to Reproduce:
1. Use a recent 4.3 RHCOS
2. Follow instructions at https://bugzilla.redhat.com/show_bug.cgi?id=1758091#c20
3. Note the bond doesn't come up until the machine reboots
Actual results:
The bond is created, but no longer starts up right away. Instead, it starts up only after rebooting.
Expected results:
The bond starts up immediately.
https://bugzilla.redhat.com/show_bug.cgi?id=1767771 is also connected to this.
Seeing results similar to Steve's with 202001072253.0 this week. Before the December holiday, reboots were not necessary, and provisioned machines whose bonded interfaces were passed to dracut through /proc/cmdline functioned normally.
While we continue to debug this issue, the workaround is to include a systemd unit file in the initial Ignition config that brings up the bond interface on first boot.
$ cat bond-up.sh
nmcli connection up bond0
$ cat bond-up.service
Example Ignition snippet:
We think the fix in BZ#1758091 is incomplete.
After numerous tests, we believe we need to enhance the `coreos-teardown-initramfs-network.service` to properly down + remove the bonded interfaces while in the initramfs.
The current code in that service will just skip the `/sys/class/net/bonding_masters` entry and will only do `ip link set bond0 down` as part of the teardown process.
However, this still leaves the bonded interface defined under `/sys/class/net/` which seems to confuse NetworkManager in the real root.
If we modify the service to properly remove the bonded interface definition under `/sys/class/net`, NetworkManager in the real root is able to properly start the interface.
See https://github.com/torvalds/linux/blob/master/Documentation/networking/bonding.txt#L1427-L1436 for more info on removing bond configurations.
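The teardown described above could look roughly like the following. This is a minimal sketch, not the code that shipped in the fix: the function name `teardown_bonds` and the `SYSFS_NET` and `DRY_RUN` variables are hypothetical. The only behaviour taken from the kernel bonding documentation is that writing `-<bondname>` to `bonding_masters` removes that bond.

```shell
# Sketch: bring down and remove every bond listed in bonding_masters.
# SYSFS_NET and DRY_RUN are illustrative knobs, not part of the real service.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

teardown_bonds() {
    local masters="${SYSFS_NET}/bonding_masters"
    local bond
    # No bonding_masters file means the bonding module is not loaded.
    [ -f "$masters" ] || return 0
    # bonding_masters holds a space-separated list of bond names.
    for bond in $(cat "$masters"); do
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: only print what would be executed.
            echo "ip link set ${bond} down"
            echo "echo -${bond} > ${masters}"
        else
            ip link set "$bond" down
            # Writing "-<name>" asks the bonding driver to delete that bond.
            echo "-${bond}" > "$masters"
        fi
    done
}
```

Running `DRY_RUN=1 teardown_bonds` prints the commands without touching the interfaces, which is a safe way to check which bonds would be removed.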
The PR is included in `ignition-0.34.0-1.rhaos4.3.git92f874c.el8` which in turn is included in RHCOS 43.81.202001140253.0
We'll need to bump the boot images in the installer to properly include this fix.
https://github.com/openshift/installer/pull/2914 tracks the fix for the installer.
Moving to MODIFIED per request.
Verified on 43.81.202001141554.0
The bonded interface comes up after being configured via kargs.
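A check along these lines can confirm the bond state without a reboot. It is only a sketch: the function name `bond_is_up` and the `SYSFS_NET` override are hypothetical, and it relies solely on the kernel's per-interface `operstate` file under `/sys/class/net`.

```shell
# Sketch: report whether a bond interface is up.
# SYSFS_NET is an illustrative override so the check can point at a test tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

bond_is_up() {
    local bond="${1:-bond0}"
    local state_file="${SYSFS_NET}/${bond}/operstate"
    local state
    # A missing operstate file means the interface does not exist at all.
    if [ ! -r "$state_file" ]; then
        echo "${bond}: missing"
        return 2
    fi
    state=$(cat "$state_file")
    echo "${bond}: ${state}"
    # Succeed only when the kernel reports the link as up.
    [ "$state" = "up" ]
}
```

For example, `bond_is_up bond0 || nmcli connection up bond0` would apply the earlier workaround only when the bond is not already up.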
Clearing all the NEEDINFOs as we have a fix that has been verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.