Bug 1789601
| Summary: | bonding configuration no longer works as documented | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Steve Milner <smilner> |
| Component: | RHCOS | Assignee: | Micah Abbott <miabbott> |
| Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 4.3.0 | CC: | aanjarle, asadawar, bbreard, bchardim, dahernan, dcain, dornelas, dustymabe, erich, imcleod, jligon, mharri, miabbott, mnguyen, nstielau, scuppett |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | 4.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1791279 1792022 (view as bug list) | Environment: | |
| Last Closed: | 2020-01-23 11:20:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1791279 | | |
| Bug Blocks: | 1186913, 1792022 | | |
Description
Steve Milner
2020-01-09 21:36:07 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1767771 is also connected to this. Seeing similar results as Steve with 202001072253.0 this week. Before the December holiday, reboots were not necessary, and machines provisioned with bonded interfaces passed to dracut on the kernel command line (/proc/cmdline) functioned normally. While we continue to debug this issue, the workaround is to include a script and a systemd unit in the initial Ignition config that bring up the bond interface on first boot.
```
$ cat bond-up.sh
#!/usr/bin/env bash
nmcli connection up bond0
touch /var/lib/bond-up
$ cat bond-up.service
[Unit]
Before=multi-user.target
After=network-online.target
ConditionPathExists=!/var/lib/bond-up
[Service]
Type=oneshot
ExecStart=/usr/local/bin/bond-up.sh
[Install]
WantedBy=multi-user.target
```
Example Ignition snippet:
```
{
  "ignition": {
    "config": {},
    "security": {
      "tls": {}
    },
    "timeouts": {},
    "version": "2.2.0"
  },
  "networkd": {},
  "passwd": {
    "users": [
      {
        "groups": [
          "sudo",
          "wheel",
          "adm",
          "systemd-journal"
        ],
        "name": "core",
        "passwordHash": "$6$xXXXXXX...",
        "sshAuthorizedKeys": [
          "ssh-rsa AAAAB3NzaC1XXXXX..."
        ]
      }
    ]
  },
  "storage": {
    "files": [
      {
        "contents": {
          "source": "data:text/plain;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaApubWNsaSBjb25uZWN0aW9uIHVwIGJvbmQwCnRvdWNoIC92YXIvbGliL2JvbmQtdXAK"
        },
        "filesystem": "root",
        "mode": 509,
        "path": "/usr/local/bin/bond-up.sh"
      }
    ]
  },
  "systemd": {
    "units": [
      {
        "contents": "[Unit]\nBefore=multi-user.target\nAfter=network-online.target\nConditionPathExists=!/var/lib/bond-up\n[Service]\nType=oneshot\nExecStart=/usr/local/bin/bond-up.sh\n[Install]\nWantedBy=multi-user.target\n",
        "enabled": true,
        "name": "bond-up.service"
      }
    ]
  }
}
```
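For anyone adapting the workaround, the two encoded values in the Ignition snippet (the base64 `source` and the decimal `mode`) can be derived from the script file itself. The following is a rough sketch only; the helper names `ignition_data_url` and `ignition_mode` are made up for illustration and are not part of Ignition or any RHCOS tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Ignition) for building the values used
# in a "storage.files" entry of a spec 2.x Ignition config.

# Print a base64 data URL for the file given as $1.
# Line wraps are stripped with tr so the result is a single line.
ignition_data_url() {
    printf 'data:text/plain;base64,%s\n' "$(base64 < "$1" | tr -d '\n')"
}

# Convert an octal permission string (e.g. 775) to the decimal integer
# that JSON requires for "mode". printf treats a leading 0 as octal.
ignition_mode() {
    printf '%d\n' "0$1"
}
```

Running `ignition_data_url bond-up.sh` against the script shown earlier should reproduce the `source` value in the snippet, and `ignition_mode 775` prints `509`, which is why the snippet carries `"mode": 509` rather than an octal literal.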
We think the fix in BZ#1758091 is incomplete. After numerous tests, we believe we need to enhance the `coreos-teardown-initramfs-network.service` to properly down + remove the bonded interfaces while in the initramfs. The current code in that service will just skip the `/sys/class/net/bonding_masters` entry and will only do `ip link set bond0 down` as part of the teardown process.

https://github.com/coreos/ignition-dracut/blob/spec2x/dracut/30ignition/coreos-teardown-initramfs-network.sh#L13

However, this still leaves the bonded interface defined under `/sys/class/net/`, which seems to confuse NetworkManager in the real root. If we modify the service to properly remove the bonded interface definition under `/sys/class/net`, NetworkManager in the real root is able to properly start the interface.

See https://github.com/torvalds/linux/blob/master/Documentation/networking/bonding.txt#L1427-L1436 for more info on removing bond configurations.

PR merged.

The PR is included in `ignition-0.34.0-1.rhaos4.3.git92f874c.el8`, which in turn is included in RHCOS 43.81.202001140253.0.

We'll need to bump the boot images in the installer to properly include this fix.

https://github.com/openshift/installer/pull/2914 houses the fix reference for the installer.

Moving to modified per request.

Verified on 43.81.202001141554.0: the bonded interface comes up after setting it up using kargs.

Clearing all the NEEDINFOs as we have a fix that has been verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
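To illustrate the teardown idea discussed in this thread (removing the bond definition rather than only setting the link down), here is a minimal sketch based on the sysfs interface described in the kernel bonding documentation. It is not the code that was merged into ignition-dracut; the function name `remove_bond` and its second parameter are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the code merged into ignition-dracut.
# Remove a bond definition through the bonding sysfs interface
# (kernel docs: "echo -bondX > /sys/class/net/bonding_masters").
#
# $1: bond name, e.g. bond0
# $2: path to the bonding_masters control file. Parameterised here so the
#     function can be dry-run against a scratch file; that is an assumption
#     of this sketch, not something the real service does.
remove_bond() {
    local bond=$1
    local masters=${2:-/sys/class/net/bonding_masters}

    # Nothing to do if the bonding driver is not loaded.
    [ -e "$masters" ] || return 0

    # Only act if the bond is currently listed (names are space separated).
    case " $(cat "$masters") " in
        *" $bond "*) ;;
        *) return 0 ;;
    esac

    # Writing "-<name>" asks the kernel to destroy that bond device.
    echo "-$bond" > "$masters"
}
```

On a real system this would be called as `remove_bond bond0` with root privileges, after the existing `ip link set bond0 down` step. Passing a scratch file as the second argument allows the logic to be exercised without touching any interface.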