Bug 1760262
Summary: | Bridge linux profile is not activated and stuck in connecting state after reboot | ||||||
---|---|---|---|---|---|---|---|
Product: | [oVirt] vdsm | Reporter: | Michael Burman <mburman> | ||||
Component: | General | Assignee: | Dominik Holler <dholler> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Michael Burman <mburman> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 4.40.0 | CC: | aloughla, atragler, bgalvani, bugs, danken, dholler, fgiudici, jcall, lrintel, mperina, rkhan, sukulkar, thaller | ||||
Target Milestone: | ovirt-4.4.0 | Flags: | mperina: ovirt-4.4? | ||||
Target Release: | 4.40.0 | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | No Doc Update | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2020-05-20 20:01:50 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | 1741792, 1756944, 1762028 | ||||||
Bug Blocks: | |||||||
Attachments: |
|
Description
Michael Burman
2019-10-10 09:38:23 UTC
Two issues:

For one, in the log you see multiple connection profiles, for example connection b967a965-4bd2-4fd8-99b2-b6d81d27cc7a from /etc/sysconfig/network-scripts/ifcfg-ens4f0. That is a regular ethernet profile, not a slave profile for a bridge (it has no "connection.master" property set). Overall, there is no profile available for device "ens4f0" that would be a suitable slave profile. I don't know what the state was before reboot, but obviously, if you don't persist suitable profiles before rebooting, it's not going to work afterwards. I would guess that e47db561-27c5-4399-a553-39d694a6b932 was in-memory only. Check the path with `nmcli -f all connection`; if it's under /run/NetworkManager/system-connections, then it's in-memory and will be lost after reboot.

A second problem is

Oct 09 23:35:07 localhost.localdomain dhclient[1434]: DHCPDISCOVER on ens4f0 to 255.255.255.255 port 67 interval 7 (xid=0xbca830c)

This is dracut/initrd, which configures the interface by running dhclient on it. Later, when NM starts, it sees that the interface "ens4f0" is already pre-configured by something else. This results in

<info> [1570693052.3064] manager: (ens4f0): assume: will attempt to assume matching connection 'ens4f0' (b967a965-4bd2-4fd8-99b2-b6d81d27cc7a) (guessed)

This means that NetworkManager will try to gracefully take over the pre-configured device with the plain ethernet connection. That will not work very well, though, and I don't think that is what is intended. This behaviour, where initrd preconfigures the device with dhclient and passes it to later boot (NetworkManager), has many issues. In rhel-8.2, those will be solved by also running NetworkManager in initrd.

It turned out that dracut was overwriting the /etc/sysconfig/network-scripts/ifcfg-ens4f0 file during boot. I don't think this is a bug in NetworkManager. What do you think? Can we close or reassign this?
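The in-memory check suggested above (comparing the profile path reported by nmcli against /run/NetworkManager/system-connections) could be sketched roughly as follows. The function name `profile_storage` is hypothetical, and treating /etc/NetworkManager/system-connections and /etc/sysconfig/network-scripts as the persistent locations is an assumption based on common NetworkManager setups, not something stated in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a NetworkManager profile by the file path
# that nmcli reports for it (the FILENAME column).
profile_storage() {
    case "$1" in
        # Profiles under /run are in-memory only and are lost on reboot.
        /run/NetworkManager/system-connections/*)
            echo "in-memory" ;;
        # Assumed persistent locations (keyfile and ifcfg-rh plugins).
        /etc/NetworkManager/system-connections/*|/etc/sysconfig/network-scripts/*)
            echo "persistent" ;;
        *)
            echo "unknown" ;;
    esac
}

# Possible usage on a live host (not executed here):
#   nmcli -t -f UUID,FILENAME connection show |
#       while IFS=: read -r uuid file; do
#           echo "$uuid $(profile_storage "$file")"
#       done
```

A profile reported as "in-memory" before a reboot would be a candidate for the missing bridge slave profile described above.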
(In reply to Thomas Haller from comment #3)
> Turned out, that dracut was overwriting
> /etc/sysconfig/network-scripts/ifcfg-ens4f0 file during boot.

Thomas, are you saying that we had a proper ifcfg-ens4f0 as a bridge slave, but dracut unilaterally overwrote it with something else? Would you elaborate so this bug can be moved to the offending component?

> Thomas, are you saying that we had a proper ifcfg-ens4f0 as a bridge slave, but dracut unilaterally overwrote it with something else?

That's what I am saying.

> Would you elaborate so this bug can be moved to the offending component?

The system is configured to do rd.neednet=1. Dracut does what it is requested to do. I don't know the offending component. When installing the image, look at the resulting system for what is installed and configured. Then see where that configuration comes from.

Created attachment 1629177 [details]
reproduction and workaround on plain RHEL 8.1 without RHV
Red Hat Knowledge Base (Solution) 3017441 describes the behavior,
the suggested workaround
echo 'omit_dracutmodules+="ifcfg"' >> /etc/dracut.conf.d/99-disable_ifcfg.conf
works for me.
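
Because the one-liner above appends on every run, a repeatable variant might look like the sketch below. The function name `add_ifcfg_omit` and its optional directory argument are hypothetical, added only so the behaviour can be tried against a scratch directory; the file name and the configuration line are taken from the workaround as quoted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the dracut drop-in only if the exact line is
# not already present, so repeated runs do not append duplicates.
add_ifcfg_omit() {
    local conf_dir="${1:-/etc/dracut.conf.d}"   # directory argument is an assumption
    local conf_file="$conf_dir/99-disable_ifcfg.conf"
    local line='omit_dracutmodules+="ifcfg"'

    # -x matches whole lines, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf_file"
    fi
}

# Possible usage as root on a real host (not executed here):
#   add_ifcfg_omit
```

On a real host the initramfs would presumably also need regenerating (for example with `dracut -f`) before the change takes effect; the thread does not say so, so treat that as an assumption.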
The ifcfg files are protected by the fix, but I am interested to know if there are flows which result in an unexpected runtime state.

vdsm-4.40.0-141.gitb9d2120.el8ev.x86_64, which was shipped with 4.4.0-5, doesn't include the desired fix. Moving back to MODIFIED.

Verified on vdsm-4.40.0-164.git38a19bb.el8ev.x86_64 with rhvm-4.4.0-0.9.master.el7.noarch.

This bugzilla is included in the oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.