| Summary: | NetworkManager brings down ethX configured using iBFT over software iSCSI |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | NetworkManager |
| Version: | 6.8 |
| Status: | CLOSED WONTFIX |
| Severity: | high |
| Priority: | high |
| Reporter: | Nilesh Javali <nilesh.javali> |
| Assignee: | Lubomir Rintel <lrintel> |
| QA Contact: | Desktop QE <desktop-qa-list> |
| CC: | aloughla, andrew.vasquez, atragler, bgalvani, cdupuis, girisha.davanageri, GR-Linux-NIC-Dev, jkachuck, joseph.szczypek, karen.skweres, linda.knippers, lrintel, mleitner, nigel.croxon, nilesh.javali, rkhan, shyam.sundar, sukulkar, thaller, tom.vaden, trinh.dao, Yuval.Mintz |
| Target Milestone: | rc |
| Hardware: | Unspecified |
| OS: | Linux |
| Type: | Bug |
| Last Closed: | 2017-04-05 15:06:17 UTC |
| Bug Blocks: | 1383344, 1425546, 1438054 |

> 2. The NetworkManager seems to do some configuration change which causes an
> internal reload even though ifcfg-eth0 clearly states “NM_CONTROLLED=no”.
bnx2x has several driver flows that cause an internal reload [changing MTU, disabling LRO, etc.]. Those need to be prevented in BFS scenarios, as they would cause the filesystem to disconnect.
So, just to focus the effort: the issue here is not why we're seeing the internal reload of eth0, but rather why we can't prevent NetworkManager from claiming ownership of eth0.
Any update on this?

Aniss, please update Priority to high. We are unable to edit it somehow.

*** Bug 1372411 has been marked as a duplicate of this bug. ***

RH, any update on this?

The team has been tied up with the 7.3 release. Please stand by and we will provide feedback this week.

Hi,

I installed RHEL 6.8 on an iSCSI root using iBFT, rebooted and didn't experience any issue, so I guess the problem is hardware-dependent.

I see that the /etc/sysconfig/network-scripts/ifcfg-eth0 file contains:

    DEVICE="eth0"
    BOOTPROTO="ibft"
    NM_CONTROLLED="no"
    ...

With such a configuration, the connection will be handled by the ibft NM plugin, which will fetch interface parameters from the firmware table. The ibft plugin doesn't handle the NM_CONTROLLED key in ifcfg files, and thus every time NM starts it will reconfigure the interface. In normal conditions this doesn't cause any problems, as NM only disconnects the device and reconnects it immediately, re-applying the same parameters set by the kernel/ramdisk. I suspect that in your case the device doesn't come up again after NM reconfigures it.

To confirm this, it would be useful to have more logs. Could you please attach the output of the following command:

    service NetworkManager stop; /usr/sbin/NetworkManager --no-daemon --log-level=DEBUG

and then the 'dmesg' output?

A possible workaround is to tell NM that eth0 should never be managed by adding a section in /etc/NetworkManager/NetworkManager.conf containing:

    [keyfile]
    unmanaged-devices=mac:xx:xx:xx:xx:xx:xx

where xx:xx:xx:xx:xx:xx is the MAC address of eth0. After a reboot of the system, NM should ignore eth0 altogether.

We have asked our testing QA to collect the data. Will update once debug data is available.

We were able to reproduce the issue and boot in single user mode to collect the needed data. Unfortunately, dmesg does not capture the output of "service NetworkManager stop; /usr/sbin/NetworkManager --no-daemon --log-level=DEBUG". Is there any other place where the logs get captured? I could see the logs on the screen but was unable to capture them.

The workaround of adding a section to /etc/NetworkManager/NetworkManager.conf also did not work. The system still fails to boot.

(In reply to Nilesh Javali from comment #11)
> We were able to reproduce the issue and boot in single user mode to collect
> the needed data. Unfortunately, the dmesg does not capture the output of
> "service NetworkManager stop; /usr/sbin/NetworkManager --no-daemon
> --log-level=DEBUG". Is there any other place where the logs get captured. I
> could see the logs on the screen but unable to capture them.

You can redirect logs to a file in the following way:

    # /usr/sbin/NetworkManager --no-daemon --log-level=DEBUG 2> /root/nm.log

or, if you prefer, add the following snippet to /etc/NetworkManager/NetworkManager.conf:

    [logging]
    level=DEBUG

and then run:

    # service NetworkManager restart

Logs will be saved to /var/log/messages. Thanks.

Created attachment 1213864 [details]
NetworkManager logs
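
The interaction between BOOTPROTO="ibft" and NM_CONTROLLED="no" explained in this thread can be checked on an installed system with a small script. The following is only a sketch: the function names `ifcfg_get` and `check_ifcfg` are hypothetical, the parser handles only simple single-line `KEY=value` assignments, and the assumption that NM_CONTROLLED="no" is honoured for non-iBFT ifcfg files is not taken from this bug.

```shell
#!/bin/sh
# Print the value of a key from an ifcfg-style file (KEY=value or KEY="value").
# $1 = file, $2 = key. Only simple single-line assignments are handled.
ifcfg_get() {
    sed -n "s/^[[:space:]]*$2=//p" "$1" | tail -n 1 | tr -d "\"'"
}

# Classify an ifcfg file according to the explanation given in this bug:
# with BOOTPROTO=ibft the ibft plugin handles the connection and does not
# look at NM_CONTROLLED. The "unmanaged" branch is an assumption about the
# regular ifcfg handling, not something confirmed in this thread.
check_ifcfg() {
    file=$1
    bootproto=$(ifcfg_get "$file" BOOTPROTO | tr 'A-Z' 'a-z')
    nm_controlled=$(ifcfg_get "$file" NM_CONTROLLED | tr 'A-Z' 'a-z')
    if [ "$bootproto" = "ibft" ] && [ "$nm_controlled" = "no" ]; then
        echo "ibft: NM_CONTROLLED=no is ignored"
    elif [ "$nm_controlled" = "no" ]; then
        echo "unmanaged"
    else
        echo "managed"
    fi
}
```

For example, `check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0` would flag the configuration quoted in this bug, which is the case where the unmanaged-devices workaround is needed instead.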
Unfortunately, the new logs aren't useful, as NM can't start in single-user mode due to missing dependent services. Can you please try to boot in normal (non-single) mode (which should fail), then boot in single-user mode and attach the fragment of /var/log/messages related to the previous boot?

Also, you can do an interactive boot [1]: press ESC and, when you see services starting, press 'i' to enter interactive mode; at that point, press 'n' when the system asks whether NetworkManager should be started, or 'y' for all other services.

[1] https://access.redhat.com/solutions/25950

After the system has booted, start NM with:

    /usr/sbin/NetworkManager --no-daemon --log-level=DEBUG 2>&1 | tee /root/nm.log

to send the output to both the console and a file. This should catch what NM is doing and why network connectivity is disrupted.

Thanks!

RH, any new update on this bug? RHEL 6.9 Alpha is releasing next month.

Hi Trinh,

Would it be possible to give us the requested information by following the steps provided in comment #14?

Thanks!
Sushil

Yes, I am asking our HPE engineer for info. Girisha, can you please answer comment 14?

Hi, thanks for this. We are coming very close to the end of the lifecycle of the release, and it would be helpful if you could get us this info as soon as possible.

Thanks!
Sushil

Created attachment 1225453 [details]
nmlogs
Please find the requested logs attached.
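
A debug capture such as /root/nm.log is long, so it can help to pull out only the warning and error lines before reading it. The sketch below assumes the log marks levels with tags such as `<warn>` (seen in this bug) and `<error>` (an assumption); `nm_log_problems` is a hypothetical helper name.

```shell
#!/bin/sh
# Print only the lines tagged <warn> or <error> from a NetworkManager log.
# $1 = path to the captured log file.
# The <error> tag is assumed; only <warn> appears in the logs quoted here.
nm_log_problems() {
    grep -E '<(warn|error)>' "$1"
}
```

Running `nm_log_problems /root/nm.log` after the capture prints the matching lines, or nothing if the log contains none.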
Created attachment 1225475 [details]
HPE screenshot
Created attachment 1225476 [details]
HPE messages
Created attachment 1225477 [details]
HPE nm
Girisha, thank you for adding the files.

The activation of the DHCP connection fails because NM can't create a configuration file for dhclient, apparently because the root file system is mounted read-only:

    <warn> (eth0): error creating dhclient configuration: Failed to create file '/var/run/nm-dhclient-eth0.conf.ES0ERY': Read-only file system

I can't tell why the filesystem is read-only, which probably does not depend on NM; anyway, it seems that the workaround suggested in comment 9 was not in place or not effective. Since iBFT provides configuration for both interfaces, eth0 and ibft0:

    NetworkManager[5189]: ibft: read connection 'iBFT ibft0'
    NetworkManager[5189]: ibft: read connection 'iBFT eth0'

please ensure that /etc/NetworkManager/NetworkManager.conf contains

    [keyfile]
    unmanaged-devices=mac:xx:xx:xx:xx:xx:xx;mac:yy:yy:yy:yy:yy:yy

where xx:xx:xx:xx:xx:xx and yy:yy:yy:yy:yy:yy are the MAC addresses of eth0 and ibft0.

If possible, please provide the content of NetworkManager.conf, the output of 'ip link' and the logs with the workaround in place. Thanks.

Created attachment 1225942 [details]
New_RHEL_NM_Logs.zip
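
The read-only root file system reported in the dhclient warning can be confirmed independently of NetworkManager by looking at the mount table. A minimal sketch, assuming the standard /proc/mounts layout (device, mount point, type, options); the helper names `root_mount_options` and `root_is_readonly` are hypothetical.

```shell
#!/bin/sh
# Print the mount options of the last entry mounted on "/" in a
# /proc/mounts-style file. $1 = file to read (defaults to /proc/mounts).
root_mount_options() {
    awk '$2 == "/" { opts = $4 } END { if (opts != "") print opts }' "${1:-/proc/mounts}"
}

# Succeed (exit status 0) if the root filesystem is mounted read-only,
# i.e. if "ro" is one of the comma-separated mount options.
root_is_readonly() {
    case ",$(root_mount_options "$1")," in
        *,ro,*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For instance, `root_is_readonly && echo "root is read-only"` run from the failing environment would show whether the state matches the "Read-only file system" error in the log.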
    unmanaged-devices=mac:5C:B9:01:C5:50:C0;mac:5C:B9:01:C5:50:C4

Ok, this seems correct, with the exception that MAC addresses must be lower-case; I didn't notice in the NetworkManager.conf man page there was this constraint.

Could you please try again with:

    unmanaged-devices=mac:5c:b9:01:c5:50:c0;mac:5c:b9:01:c5:50:c4

This time it should work!

Girisha, can you run the test in comment 27?

Girisha, I need an answer for comment 27.

Girisha, update please?

(In reply to Beniamino Galvani from comment #27)
> unmanaged-devices=mac:5C:B9:01:C5:50:C0;mac:5C:B9:01:C5:50:C4
>
> Ok, this seems correct, with the exception that MAC addresses must be
> lower-case; I didn't notice in the NetworkManager.conf man page there was
> this constraint.
>
> Could you please try again with:
>
> unmanaged-devices=mac:5c:b9:01:c5:50:c0;mac:5c:b9:01:c5:50:c4
>
> This time it should work!

Girisha comment: After changing the MAC addresses from upper case to lower case, the OS boots successfully.

With the workaround, the OS boots successfully.

Created attachment 1242543 [details]
logs after changing the macadrres to lowercase
Please find the attached logs after changing the mac addresses to lowercase.
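
Since the workaround only took effect once the MAC addresses were written in lower case, the `unmanaged-devices` line can be generated rather than typed by hand. This is a sketch only; `unmanaged_devices_line` is a hypothetical helper, and it does not validate that its arguments are well-formed MAC addresses.

```shell
#!/bin/sh
# Build an "unmanaged-devices=" line from the MAC addresses given as
# arguments, converting the hex digits of each address to lower case and
# joining the entries with ";" as in the configuration shown in this bug.
unmanaged_devices_line() {
    line=""
    for mac in "$@"; do
        mac=$(printf '%s' "$mac" | tr 'A-F' 'a-f')
        line="${line:+$line;}mac:$mac"
    done
    printf 'unmanaged-devices=%s\n' "$line"
}
```

Calling `unmanaged_devices_line 5C:B9:01:C5:50:C0 5C:B9:01:C5:50:C4` yields the lower-case line that booted successfully above; the result still has to be placed under the `[keyfile]` section of /etc/NetworkManager/NetworkManager.conf.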
RH, any new update on this bug?

Hello,
Due to where we are in the RHEL 6.9 release, this will not make RHEL 6.9. It is now requested for RHEL 6.10.
Thank You
Joe Kachuck

Hello,
This will not be fixed in RHEL 6.10. Please confirm whether you would like the workaround in comment 27 published in a kbase.
Thank You
Joe Kachuck

Hello,
RHEL 6 has entered Phase 3. In Phase 3, only Critical impact Security Advisories and selected Urgent Priority Bug Fix Advisories will be accepted.
https://access.redhat.com/support/policy/updates/errata
Currently, this BZ does not meet these requirements. I am closing this BZ as WONTFIX. Please reopen it if this fix is required for RHEL 6, and if so, please also provide a justification for the fix.
Thank You
Joe Kachuck

Created attachment 1197848 [details]
Different NetworkManager logs in single user mode and boot failure serial logs

Description of problem:
The RHEL 6.8 installation is successful using an iSCSI boot device over pure software iSCSI (no hardware assist or offload). But the OS fails to boot as the NetworkManager, upon switching the root, brings the ethX interface down, causing iSCSI connection loss.

Version-Release number of selected component (if applicable):
RHEL 6.8 GA

How reproducible:
Consistently

Steps to Reproduce:
1. Install RHEL 6.8 GA on iSCSI boot device over pure software iSCSI
2. Installation is successful
3. Boot the installed OS and observe the iSCSI connection loss and OS fails to boot.

Actual results:
The OS fails to boot.

Expected results:
The OS should boot successfully

Additional info:
1. The OS boots in single user mode, where issuing command "service NetworkManager restart" causes network connectivity loss.
2. The NetworkManager seems to do some configuration change which causes an internal reload even though ifcfg-eth0 clearly states “NM_CONTROLLED=no”.