Description of problem:
The e1000 NIC doesn't work properly under F7. There seem to be two major problems with it:
1. It can hardly get an IP address from DHCP.
2. Unplugging the network cable from the NIC causes the system to hang.
The kernel I'm currently using is kernel-2.6.21-1.3194.fc7 on the i686 platform.

How reproducible:
This problem is always reproducible on my IBM ThinkPad T42 2373-NTH.

Steps to Reproduce:
1. Start Fedora 7 normally with the network configured to get its IP from DHCP.
2. It may not get an IP address on boot. This is the first problem.
3. If it does get an IP on boot, open a terminal after boot and run "service network restart". It cannot get an IP address this time.
4. Unplug the network cable from the NIC. The system hangs and no longer responds to the keyboard. You have to power off and reboot the machine.

Actual results:
1. Cannot get an IP on boot or when restarting the network service.
2. System hangs when unplugging the network cable.

Expected results:
1. It gets an IP correctly.
2. System doesn't hang when unplugging the cable.

Additional info:
If I replace the kernel with RHEL5's kernel-2.6.18-8.el5 or FC6's kernel-2.6.20-1.2952.fc6, this problem no longer exists. If I rebuild the F7 kernel from source, the problem still exists. This bug seems similar to http://www.uwsg.iu.edu/hypermail/linux/kernel/0705.3/0086.html but I cannot confirm this.
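For anyone who wants to repeat steps 3 and 4, here is a minimal sketch of them as a script. It assumes the interface is eth0; IFACE, DRY_RUN, run and repro_steps are illustrative names and not part of the report. By default it only prints the commands, since running them on an affected kernel may hang the machine.

```shell
#!/bin/sh
# Sketch only: IFACE, DRY_RUN, run and repro_steps are illustrative names,
# not part of the original report.
IFACE="${IFACE:-eth0}"
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repro_steps() {
    # Step 3: restart networking; on affected kernels DHCP reportedly fails here.
    run service network restart
    # Show whether the interface obtained an address.
    run ifconfig "$IFACE"
    # Step 4 is manual: unplugging the cable cannot be scripted.
    echo "manual step: unplug the cable from $IFACE and check the keyboard"
}

repro_steps
```

Run it as root with DRY_RUN=0 to actually execute the commands, and save your work first.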
I am using 2.6.21-1.3194.fc7 as well on my ThinkPad T40 and am also running into this problem. I also run into it when I unplug my NIC and then move my laptop to another network, even when it is turned off.
I have a Lenovo T60 with the same problem, only with 1.3194.fc7. 1.3189.fc7 does not suffer from this issue.
1) eth0 does not get an IP during boot time
2) ifup eth0 powers up the interface, but does not acquire the IP
3) dhclient eth0 and ifconfig eth0 show that there are packets leaving the interface. Unfortunately the system stops responding to the keyboard in about 2 minutes and becomes totally unresponsive when trying to reboot or power off.
I see nothing interesting in messages or on the terminal.
Created attachment 156021 [details] output of `lspci -vvv'
I also experienced this issue on a Dell PowerEdge 1800, using the Intel 82541GI gige controller. For me the problem appears when attempting to re-configure the IP following a successful initialisation. For example, when kickstarting the new build the box was able to initially acquire an IP and successfully download the kickstart configuration, which included a dhcp network statement. Once this was applied the kickstart would hang: it re-configures the interface before proceeding, and was then unable to mount the NFS share on the server to reach the installation files. (dhcpd and nfsd run on the same server.) At both points dhcpd logs that it offers the client the correct IP, but on the second attempt the client does not reply with a DHCP request packet and eventually becomes unresponsive at the console. The workaround was to edit the kickstart config to use a static IP, after which the build completes successfully. Following firstboot I re-configured the NIC back to use dhcp and restarted the network service. The NIC did not re-acquire an IP and the system became unresponsive. On all subsequent boots the system always acquires an IP via dhcp during boot but will fail post-boot when the NIC is brought down then back up.
I've been playing with my laptop this morning trying to discover the exact conditions in which this happened and I've discovered some interesting things. These system hangs and failures of the NIC only happen when the link is unplugged while the box has power and maybe only when it's booted. I was successfully able to turn off my laptop, unplug the power and then unplug the link then replug in the link and the power (in that order) and still have my NIC work. I then attempted to bootup without link, shutdown, bootup with link and it worked fine. I haven't tried plugging the link into a booted system or unplugging while the NIC has power.
The e1000 problem was not the only one after a fresh F7 install:
- suspend didn't work
- postfix didn't stop after the network mess and so the system didn't shut down
My system is a Lenovo ThinkPad T60. I replaced the kernel this morning with 2.6.21-1.3200.fc8 from development and everything is working fine again. BTW: Why is the %dist tag still named "fc"?
I have an x32, same problem with the integrated ethernet using e1000. I'm using 2.6.20-1.2948.fc6 in order to make the network work. No messages in /var/log/messages when it hangs.
Similar trouble on an Intel OEM server board (with on-board e1000), SE7210TP1E.
1. Can't always get an IP from the DHCP server (usually can't).
2. Ethernet is intermittent. FTP & SSH sessions get randomly dropped.
3. Can't download updates.
4. System hangs when shutting down.
5. Other inconsistent weirdness.
This was a fresh F7 install.
The same issue happened on my Benq S72G-110. When I activate the network, the computer hangs. After replacing kernel-2.6.21-1.3194.fc7 with kernel-2.6.20-1.2952.fc6, the issue seems to be solved, as it no longer happens.
Same result with the VIA VT6102 Rhine-II rev 7c network card. Heavy network traffic = hang. Kernel 3212.i686 is better, but still hangs.
Same symptoms on a Lenovo X60 running kernel 3194, x86_64. I don't normally get an address from DHCP when I plug in the wire, though I seem to be ok if I boot the machine with the wire plugged in or (at least sometimes) if it's the first network connection made by NetworkManager even if I don't have the wire plugged in when the machine boots. If I remove the wire and reinsert it later, the machine locks up completely and I have to turn off the power.
Switching to kernel-2.6.21-1.3207.fc8.i686 (http://koji.fedoraproject.org/koji/buildinfo?buildID=8063) solved this issue for me: DHCP at boot time now works fine, ifup/ifdown and 'modprobe -r e1000' no longer block indefinitely.
Seeing similar on a T60 IBM/Lenovo running the F7 GA. What I see:
- NIC plugged in, boot into single user: modprobe/rmmod e1000 as often as you like, NO problems.
- FIRST time "service network start": no problem, NIC gets an IP from the DHCP bootps request.
- "service network stop" is OK.
- Then ANY command that touches eth0, e.g. ifconfig eth0 or ifdown/ifup eth0, is a FULL SYSTEM HANG... need to power off.
Thanks,
Several patches have made it upstream to fix some netif_poll_* issues. Once they are available for FC users it should be all OK.
(In reply to comment #14)
> Several patches have made it upstream to fix some netif_poll_* issues. Once they
> are available for FC users it should be all OK.
I tested the 3212 FC kernel, which has the fix: http://koji.fedoraproject.org/koji/buildinfo?buildID=7769 It was better but still hard locked. I had to drop back to 3143, which had an older e1000 driver. I do not believe the netif_poll fix in e1000_open fixes the issue. Well, unless implemented wrong....
As I said, there are "several" patches upstream. Please test the latest git tree from Linus to make sure you have all of the latest patches. I personally do not track fedora kernels and do not know which patches are merged in there.
There are more fixes in kernel 3218.
Kernel 3218 or later can be found at http://people.redhat.com/davej/kernels/Fedora/fc7/ Would someone with this bug please test?
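Before reporting test results, it may help to confirm that the running kernel really is build 3218 or later. A minimal sketch, assuming release strings of the form 2.6.21-1.NNNN.fc7 as quoted in this bug; kernel_build and has_fixes are illustrative helper names, and the comparison only makes sense within the 2.6.21-1.* series.

```shell
#!/bin/sh
# Sketch only: kernel_build and has_fixes are illustrative helper names.

# Extract the Fedora build number from a release string such as
# "2.6.21-1.3219.fc7" (prints 3219). Prints nothing if the format differs.
kernel_build() {
    echo "$1" | sed -n 's/^[^-]*-[0-9]*\.\([0-9]*\)\..*$/\1/p'
}

# Succeed if the given release is build 3218 or later.
has_fixes() {
    build="$(kernel_build "$1")"
    [ -n "$build" ] && [ "$build" -ge 3218 ]
}

if has_fixes "$(uname -r)"; then
    echo "running kernel is build 3218 or later"
else
    echo "running kernel is older than build 3218 or not in the expected format"
fi
```

Including the full output of uname -r in your comment is still the most useful thing for whoever triages this.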
I'm having this problem, and I've just installed kernel-2.6.21-1.3219.fc7.i686.rpm. I'll let you know if it fixes this.
Ok, it failed. Simply doing ifdown eth0/ifup eth0 demonstrates this bug.
This one is a real showstopper for me and stops me from being able to do kickstart installations. I am seeing the DHCP timeout issue in every system that requires the e1000 driver so far. I will try and gather as much information as I can to help solve this problem next week. This problem also arose previously for me when I rebuilt the Fedora Core 6 kernel with the same version e1000 driver that is in the Fedora 7 default kernel.
In addition to my previous comment I can confirm that this issue is on both x86_64 and i386 architectures.
I tried this kernel: http://koji.fedoraproject.org/koji/buildinfo?buildID=8282 and found that
[1] I could get a DHCP address every time
[2] I could remove/add the physical ethernet cable during a ping and NetworkManager WOULD now get my link back
BUT
[a] round-trip latencies pinging a one-hop router are very variable, 2-500ms where it's normally 2ms
[b] ifconfig eth0 down/ifconfig eth0 up still hangs everything.
Thanks
I am running kernel-2.6.21-1.3219.fc7.x86_64 from koji now (Lenovo X60) with NetworkManager. It still doesn't notice if the network cable is removed from the machine (and so doesn't try to start up the wireless). The ping latencies seem very high and quite variable, ranging from 7ms to 309ms on successive packets. If I click on the wired network in NetworkManager, I do get a popup saying the network is disconnected, but it then quickly reconnects correctly. If I disable networking in NetworkManager, I am unable to re-enable it. Doing "service NetworkManager restart" at that point hangs, leaving the machine in a state where I can't open a new gnome-terminal, and restarting from the menu in the panel doesn't work. In fact, although I can switch to VT1 and log in as root, "shutdown -h now" from there gives repeated error messages about failing to umount /home and then hangs at "Synchronizing SCSI cache for disk sda". So I still have to use the power button to turn the machine off. So I basically confirm the information from bobsyeruncle in comment #23. It does get DHCP correctly, but still doesn't work properly.
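To make the latency reports in the last two comments easier to compare, here is a rough sketch that summarises the "time=" values from ping output. latency_range is an illustrative name, the parsing assumes the usual Linux ping line format, and the sample lines use a documentation address together with the 7ms and 309ms figures quoted above.

```shell
#!/bin/sh
# Sketch only: latency_range is an illustrative helper name.

# Read ping output on stdin and print the smallest and largest "time="
# value (in ms) plus the number of replies seen.
latency_range() {
    awk -F'time=' '
        /time=/ {
            split($2, a, " ")      # a[1] holds the number before " ms"
            v = a[1] + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            n++
        }
        END { if (n > 0) printf "min=%s max=%s n=%d\n", min, max, n }
    '
}

# Illustrative sample lines, not captured from a real run.
printf '%s\n' \
    "64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=7 ms" \
    "64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=309 ms" \
    | latency_range
# prints: min=7 max=309 n=2
```

In practice you would pipe real output into it, e.g. ping -c 20 to your one-hop router followed by | latency_range.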
*** This bug has been marked as a duplicate of 241783 ***