Bug 1287726
Summary: | NIC cannot start after reboot. Manually typing systemctl start network after login can initialize network | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | edw.ekei | |
Component: | initscripts | Assignee: | initscripts Maintenance Team <initscripts-maint-list> | |
Status: | CLOSED ERRATA | QA Contact: | Leos Pol <lpol> | |
Severity: | medium | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 7.2 | CC: | asadawar, edw.ekei, jscotka, lnykryn, lpol, ptalbert, redhat-bugzilla, robert.scheck, shane.seymour | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | initscripts-9.49.31-1.el7 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1377133 | Environment: | |
Last Closed: | 2016-11-04 06:43:04 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1289485, 1313485, 1377133 | |||
Attachments: |
Description
edw.ekei
2015-12-02 14:31:03 UTC
reassigning to initscripts, which owns ifup-eth

Quick fix here is to delete the HWADDR and add DEVTIMEOUT=5 to your ifcfg file.
It is a known issue of hyper-v that the device will appear after the network script is run during the boot.
Can you please try that?

"Device has different MAC address than expected" seems to be misleading here, I will look at it and try to fix it.

I have commented out HWADDR and added DEVTIMEOUT=5. Still, network interfaces are not initialized during boot. I have tried increasing DEVTIMEOUT, up to 100.

This is the error after rebooting (no HWADDR):

sudo journalctl -xe -u network
network[453]: Bringing up interface eth0: ERROR: [/etc/sysconfig/network-scripts/ifup-eth] Device does not seem to be present, delaying initialization.
network[453]: [FAILED]
network[453]: Bringing up interface eth1: ERROR: [/etc/sysconfig/network-scripts/ifup-eth] Device does not seem to be present, delaying initialization.
network[453]: [FAILED]

This is the first NIC configuration; the second is exactly the same, using a public IP. I have tried combinations of DEVICE, NAME, UUID, HWADDR. Most of these combinations initialize the interfaces after logging in, none during boot.

/etc/sysconfig/network-scripts/ifcfg-eth0

TYPE=Ethernet
BOOTPROTO=none
IPADDR=172.16.1.57
PREFIX=24
#HWADDR=00:15:5d:31:c0:04
ONBOOT=yes
IPV6INIT=no
DEVTIMEOUT=100
DEVICE=eth0
#NAME=eth0
#UUID=84eb62ed-1727-4c2e-bc66-be65a1a56a71

Since you mention the Hyper-V issue, I'd like to comment that in RHEL 7.1 there was no such problem; using an older kernel after updating to RHEL 7.2 produced mixed results, as I have mentioned above.

Also, in case it helps: the VM is a generation 2, UEFI, on Hyper-V 2012 R2. In RHEL 7-7.1 the GRUB loader was counting seconds extremely fast. It is a known bug and MS suggests setting the GRUB delay in thousands (10000 seconds, instead of 10) in order to be able to select a kernel or other boot option. Now in RHEL 7.2 this has been corrected, GRUB counts seconds properly.

Is there any chance that the two are connected? Could the correction to GRUB timing have produced delayed NIC initialization?

(In reply to Lukáš Nykrýn from comment #2)
> Quick fix here is to delete the HWADDR and add DEVTIMEOUT=5 to your ifcfg
> file.
> It is a known issue of hyper-v that the device will appear after the network
> script is run during the boot.
> Can you please try that?
>
> "Device has different MAC address than expected" seems to be misleading
> here, I will look at it and try to fix it.

I have commented out HWADDR and added DEVTIMEOUT=5. Still, network interfaces are not initialized during boot. I have tried increasing DEVTIMEOUT, up to 100.

Sorry for not pressing "Reply", I am not very keen on the use of fora. You can see my whole message as a new comment, thank you.

Can you add

exec 30>/dev/kmsg
BASH_XTRACEFD=30
set -x

to the end of /etc/init.d/functions, reboot the machine and send me output of dmesg?

Created attachment 1103875 [details]
VM's dmesg after applying a change to /etc/init.d/functions, as requested
(In reply to Lukáš Nykrýn from comment #5)
> Can you add
>
> exec 30>/dev/kmsg
> BASH_XTRACEFD=30
> set -x
>
> to the end of /etc/init.d/functions, reboot the machine and send me output
> of dmesg?

Ok, I have uploaded the dmesg as a text file attachment.

> Ok, I have uploaded the dmesg as a text file attachment.
I have removed the public IPs, wherever you see PublicIP_replaced_x.x.x in the file is a public IP or gateway etc setting.
So the fix should be easy, we just need to backport
https://git.fedorahosted.org/cgit/initscripts.git/commit/?id=1f230a3d2e2733e30577c91645005801ab2c0f40
to rhel.

(In reply to Lukáš Nykrýn from comment #9)
> So the fix should be easy, we just need to backport
> https://git.fedorahosted.org/cgit/initscripts.git/commit/
> ?id=1f230a3d2e2733e30577c91645005801ab2c0f40
> to rhel.

Ok, should I try to copy the /etc/sysconfig/network-scripts/network-functions file from initscripts-1f230a3d2e2733e30577c91645005801ab2c0f40.zip to my server to test it?

Created attachment 1103964 [details]
network-functions
I have attached the patched /etc/sysconfig/network-scripts/network-functions, so you can try to replace just that one file.
And also please keep the DEVTIMEOUT in your ifcfg file.

I replaced the /etc/sysconfig/network-scripts/network-functions file and set DEVTIMEOUT=15 (I tried other values: with less than 15 the NICs don't start, with more than 15 there is no difference from 15).

Odd behaviour: after login, the first execution of the 'systemctl status network' command shows only loopback. Some seconds later, re-issuing the command, I can see only eth0; about a minute later I can see eth1 also! Services cannot initialize properly because of this delay. Is it related to DEVTIMEOUT that the NICs don't start concurrently?

I'll try to update the Hyper-V host also. Unfortunately it'll need a reboot, and since it hosts other VMs too it's not easy to proceed.

Let's do one more round of

exec 30>/dev/kmsg
BASH_XTRACEFD=30
set -x

Can you send me again the output of dmesg after you do all of those things?

Created attachment 1104289 [details]
dmesg after copying patched /etc/sysconfig/network-scripts/network-functions
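For readers following the DEVTIMEOUT discussion above, the idea of waiting for a late-appearing device can be sketched roughly as below. This is only an illustration of the polling technique, not the code from initscripts or from the patched network-functions attachment. The function name `wait_for_device` and the `SYSFS_NET` variable are hypothetical names introduced here, and the assumption is that a present interface shows up as an entry under /sys/class/net.

```shell
#!/bin/sh
# Hypothetical helper for illustration; NOT the initscripts implementation.
# SYSFS_NET is an assumed override hook so the function can be exercised
# against a scratch directory; by default it points at the real sysfs tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# wait_for_device DEVICE TIMEOUT_SECONDS
# Polls once per second until DEVICE appears under $SYSFS_NET.
# Returns 0 as soon as the device is present, 1 once the timeout is used up.
wait_for_device() {
    dev="$1"
    remaining="$2"
    while [ ! -e "$SYSFS_NET/$dev" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A caller might use it as `wait_for_device eth0 15 && ip link set dev eth0 up`. If interfaces are handled one after another in this way, the waits add up, which would be one possible explanation for eth1 appearing noticeably later than eth0; the thread does not confirm this.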
I uploaded the dmesg you asked for. It's from a test server. I decided to try it on my production server too; the NICs initialize. sshd, which binds to eth0 only, starts properly. named, which should bind to both eth0 and eth1, binds only to localhost. I can send you the dmesg from the production server also if you like. Thank you.

Hm, from that log, it looks like it should work.

[ 22.912864] + ip link set dev eth0 up

Even so, it is weird that it takes 13 seconds for the device to appear. Maybe you can try to increase udev.children-max=
http://www.freedesktop.org/software/systemd/man/systemd-udevd.service.html

Or maybe you can try one unsupported solution? Since 7.2 there should be systemd-networkd in the optional repository. Could you try it instead of network-scripts? Basic configuration is really easy, you just need something like:

/etc/systemd/network/80-dhcp.network:

[Match]
Name=eth*

[Network]
DHCP=yes

I sorted out some things and resolved the problem using a workaround.

First of all, in my less-than-minimal setup I have dhcp-common removed for static-IP-only VMs. As a dependency, dracut-network is also removed. I re-installed them, ran dracut -f, and the network interfaces could start without DEVTIMEOUT=15 (needed before).

After this change, my test machine's sshd and named could also start properly. In contrast, my production machine's sshd and named did not bind properly (only to localhost) although both NICs were up and running. I'm afraid that is due to other alterations.

So as a workaround for my production machine, I created these drop-in files (and their directories):

For named: /lib/systemd/system/named.service.d/Just_A_name.conf
For sshd: /lib/systemd/system/sshd.service.d/Just_A_name.conf

Same content:

[Unit]
After=network-online.target
Requires=network-online.target

And problem solved, sshd and named start after the network.

It's a very peculiar scenario (UEFI, dracut-network removed), you can close the bug if you like. Thank you.
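The drop-in workaround described above could be scripted along the following lines. This is a sketch under stated assumptions, not something taken from the thread: the function name `write_network_online_dropin`, the `UNIT_DIR` variable and the file name `wait-for-network.conf` are all hypothetical. It also defaults to /etc/systemd/system, the usual location for local overrides, whereas the reporter placed the files under /lib/systemd/system. The unit contents are the ones quoted in the comment.

```shell
#!/bin/sh
# Illustration only. UNIT_DIR is an assumed override hook; the default is the
# directory systemd reads administrator-provided unit overrides from.
UNIT_DIR="${UNIT_DIR:-/etc/systemd/system}"

# write_network_online_dropin SERVICE
# Creates SERVICE.service.d/wait-for-network.conf (hypothetical file name)
# containing the ordering and dependency settings quoted in the comment.
write_network_online_dropin() {
    service="$1"
    dir="$UNIT_DIR/$service.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/wait-for-network.conf" <<'EOF'
[Unit]
After=network-online.target
Requires=network-online.target
EOF
}
```

As root, one would call `write_network_online_dropin sshd` and `write_network_online_dropin named`, then run `systemctl daemon-reload` so that systemd picks up the new files.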
We experienced the same issue today, found bug #1180837 (which is actually exactly the same, just for a Fedora) and the patch mentioned in comment #9 solves the issue for us perfectly. I now cross-filed case #01694963 on the Red Hat customer portal to speed up things, as we need the patch ASAP in conjunction with all RHEL 7.x VMs under Microsoft Hyper-V (when not running more fancy things like a NetworkManager on a server *sigh*).

I am not sure what else I can tell you to that. This is scheduled for 7.3 now.
https://git.fedorahosted.org/cgit/initscripts.git/commit/?h=rhel7-branch&id=da83c4e174991b2dedf6ce7f8c490f2d1fbc1d57
If you need this to be fixed in z-stream you need to ask through customer portal.

Verified by bz1339648.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2456.html