Description of problem:
https://bugzilla.redhat.com/show_bug.cgi?id=1758091 added support for bonding interfaces. However, a regression appears to have occurred in recent builds: the instructions that worked previously no longer produce the same result.

How reproducible:
Currently, every time with recent 4.3 builds.

Steps to Reproduce:
1. Use a recent 4.3 RHCOS build.
2. Follow the instructions at https://bugzilla.redhat.com/show_bug.cgi?id=1758091#c20
3. Note that the bond doesn't come up until the machine reboots.

Actual results:
The bond is created, but no longer starts up right away. Instead, it starts up after rebooting.

Expected results:
The bond starts up immediately.
https://bugzilla.redhat.com/show_bug.cgi?id=1767771 is also connected to this.
Seeing similar results as Steve with 202001072253.0 this week. Before the December holiday, reboots were not necessary: machines provisioned with bonded interfaces, passed to dracut through the /proc/cmdline interface, functioned normally.
While we continue to debug this issue, the workaround is to include a systemd unit file in the initial Ignition config that ups the bond interface on first boot.

```
$ cat bond-up.sh
#!/usr/bin/env bash
nmcli connection up bond0
touch /var/lib/bond-up

$ cat bond-up.service
[Unit]
Before=multi-user.target
After=network-online.target
ConditionPathExists=!/var/lib/bond-up
[Service]
Type=oneshot
ExecStart=/usr/local/bin/bond-up.sh
[Install]
WantedBy=multi-user.target
```

Example Ignition snippet:

```
{
  "ignition": {
    "config": {},
    "security": {
      "tls": {}
    },
    "timeouts": {},
    "version": "2.2.0"
  },
  "networkd": {},
  "passwd": {
    "users": [
      {
        "groups": [
          "sudo",
          "wheel",
          "adm",
          "systemd-journal"
        ],
        "name": "core",
        "passwordHash": "$6$xXXXXXX...",
        "sshAuthorizedKeys": [
          "ssh-rsa AAAAB3NzaC1XXXXX..."
        ]
      }
    ]
  },
  "storage": {
    "files": [
      {
        "contents": {
          "source": "data:text/plain;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaApubWNsaSBjb25uZWN0aW9uIHVwIGJvbmQwCnRvdWNoIC92YXIvbGliL2JvbmQtdXAK"
        },
        "filesystem": "root",
        "mode": 509,
        "path": "/usr/local/bin/bond-up.sh"
      }
    ]
  },
  "systemd": {
    "units": [
      {
        "contents": "[Unit]\nBefore=multi-user.target\nAfter=network-online.target\nConditionPathExists=!/var/lib/bond-up\n[Service]\nType=oneshot\nExecStart=/usr/local/bin/bond-up.sh\n[Install]\nWantedBy=multi-user.target\n",
        "enabled": true,
        "name": "bond-up.service"
      }
    ]
  }
}
```
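For anyone adapting the workaround, the `source` value in the Ignition snippet is just the script encoded as a base64 data URL. A minimal sketch of producing it follows; the helper name `to_data_url` is hypothetical and not part of Ignition or any RHCOS tooling.

```shell
#!/usr/bin/env bash
# Sketch only: encode a file as a data URL for an Ignition
# "contents.source" field. `to_data_url` is a hypothetical helper name.
to_data_url() {
    local file="$1"
    # Strip newlines so the result is a single line regardless of
    # whether the local base64 implementation wraps its output.
    printf 'data:text/plain;base64,%s\n' "$(base64 < "${file}" | tr -d '\n')"
}
```

Running `to_data_url bond-up.sh` against the three-line script shown in the workaround should reproduce the `source` string in the snippet. Note that Ignition spec 2.x takes `mode` as a decimal integer, so `509` corresponds to octal `0775`.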
We think the fix in BZ#1758091 is incomplete. After numerous tests, we believe we need to enhance the `coreos-teardown-initramfs-network.service` to properly down + remove the bonded interfaces while in the initramfs.

The current code in that service will just skip the `/sys/class/net/bonding_masters` entry and will only do `ip link set bond0 down` as part of the teardown process.

https://github.com/coreos/ignition-dracut/blob/spec2x/dracut/30ignition/coreos-teardown-initramfs-network.sh#L13

However, this still leaves the bonded interface defined under `/sys/class/net/`, which seems to confuse NetworkManager in the real root. If we modify the service to properly remove the bonded interface definition under `/sys/class/net`, NetworkManager in the real root is able to properly start the interface.

See https://github.com/torvalds/linux/blob/master/Documentation/networking/bonding.txt#L1427-L1436 for more info on removing bond configurations.
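The down + remove step described above could look roughly like the sketch below. It relies on the sysfs interface from the kernel bonding documentation (writing `-<name>` to `bonding_masters` removes that bond). This is an illustration of the idea, not the code from the actual PR; the function name `remove_bonds` and its optional directory argument are hypothetical, added so the logic can be exercised against a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch only: bring down and remove every bond listed in bonding_masters.
# `remove_bonds` and its directory argument are hypothetical; the real
# teardown script may be structured differently.
remove_bonds() {
    local net_dir="${1:-/sys/class/net}"
    local masters="${net_dir}/bonding_masters"

    # Nothing to do if the bonding module is not loaded.
    [ -f "${masters}" ] || return 0

    local bond
    # bonding_masters holds a whitespace-separated list of bond names.
    for bond in $(cat "${masters}"); do
        # Down the interface first; ignore failures so teardown continues.
        ip link set "${bond}" down 2>/dev/null || true
        # Writing "-<name>" asks the bonding driver to delete the bond.
        echo "-${bond}" > "${masters}"
    done
}
```

Calling `remove_bonds` with no argument (as root) would act on the real `/sys/class/net`; after it runs, the bond names should no longer appear under that directory.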
PR merged
The PR is included in `ignition-0.34.0-1.rhaos4.3.git92f874c.el8`, which in turn is included in RHCOS 43.81.202001140253.0.

We'll need to bump the boot images in the installer to properly include this fix.
https://github.com/openshift/installer/pull/2914 houses the fix reference for the installer.
Moving to modified per request.
Verified on 43.81.202001141554.0: the bonded interface comes up after setting it up using kargs.
Clearing all the NEEDINFOs as we have a fix that has been verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062