From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Description of problem: eth0 hangs on bootup. I have to physically remove the ethernet cable for just 2 seconds and bootup resumes. Mouse and screen are frozen completely.
Version-Release number of selected component (if applicable): 2.6.21-1.3194
How reproducible: Always
Steps to Reproduce:
1. Boot and it freezes
2. Unplug the ethernet cable and plug it back in
3. It boots and runs perfectly
Actual Results: It's frozen and does nothing.
Expected Results:
Additional info: Sorry, this bug is for the F7 current release, i386.
00:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
It ran perfectly on FC6.
Happens here on my laptop. If the cable is plugged in, I have to remove it; if it is unplugged during boot, I have to plug it in to unfreeze the boot process. If I disable the Network Manager services, it boots normally whether the cable is plugged in or unplugged.
I get this also on a Gateway Desktop with Linksys Gigabit card. If I do interactive boot, skip network init, and then manually start the card, works OK.
I just updated my last fc6 machine to f7, and it has the bug as well. So I have got 2 machines out of 4 that do this. It's the same ethernet card: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet. I even got inventive and moved one Realtek from a good server to one of these bad servers; it still failed and the good one was still okay. I also tried shuffling the PCI cards around to change the order, still the same. It seems it's going to be some issue based on motherboard chipset / cpu / bios / etc. against certain brands / models of ethernet adapters.
Same problem here, with a D-Link DGE-528T gigabit ethernet adapter. Since F7 (no such issue in FC6 and previous releases), whether my modem is connected or switched off, the boot systematically blocks at the "starting networking" stage (bringing up interface eth0). The only basic solution is to switch the modem off and on again; if the modem is off, I simply switch it on. Then I get the following line: "Determining IP information for eth0... done". It is not ideal. Is it a real bug or a bad network configuration on my side? Sorry for my English.
I have an r8169 as eth1, and tulip as eth0. Booting kernel-2.6.21-1.3194.fc7 (686 arch) with eth1 set to start on boot, the system hangs at "Determining IP information for eth1". If I set eth0 to start at boot and eth1 to not start at boot, the system boots normally. If I boot kernel-xen-2.6.20-2925.9.fc7 (686 arch), the hang does not occur when starting the r8169 interface at boot.
Bugs 242301 and 242357 appear to be duplicates of this.
I read them and they look the same. I filed this bug way before those. I disagree that it's a driver bug as stated in 242357; as you can read above, different brands are affected. My guess is it's to do with the ARP test when the device fires up. It reminds me of a bug that surfaced in FC6 a while ago on a policy update, which basically caused eth to fail so that it had to be restarted manually after bootup. However, I consider this bug a high priority: if you happen to be affected by it, your server won't fire up without physically unplugging and replugging the offending eth card. If the server is at a remote location, that is very bad.
I don't know if you are getting the same error, but here is a dump of the log:
Jun 2 20:13:52 localhost kernel: r8169: eth0: link up
Jun 2 20:14:10 localhost kernel: r8169: eth0: link up
Jun 2 20:14:10 localhost kernel: BUG: soft lockup detected on CPU#0!
Jun 2 20:14:10 localhost kernel: [<c0451f3e>] softlockup_tick+0xa5/0xb4
Jun 2 20:14:10 localhost kernel: [<c042e930>] update_process_times+0x3b/0x5e
Jun 2 20:14:10 localhost kernel: [<c043d2bd>] tick_sched_timer+0x78/0xbb
Jun 2 20:14:10 localhost kernel: [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
Jun 2 20:14:10 localhost kernel: [<c043d245>] tick_sched_timer+0x0/0xbb
Jun 2 20:14:10 localhost kernel: [<c0408534>] timer_interrupt+0x2c/0x32
Jun 2 20:14:10 localhost kernel: [<c04521aa>] handle_IRQ_event+0x1a/0x3f
Jun 2 20:14:10 localhost kernel: [<c04535ea>] handle_level_irq+0x81/0xc7
Jun 2 20:14:10 localhost kernel: [<c04072c7>] do_IRQ+0xb8/0xd1
Jun 2 20:14:10 localhost kernel: [<c04058ff>] common_interrupt+0x23/0x28
Jun 2 20:14:10 localhost kernel: [<c04058ff>] common_interrupt+0x23/0x28
Jun 2 20:14:10 localhost kernel: [<c0561704>] yenta_interrupt+0x13/0xb4
Jun 2 20:14:10 localhost kernel: [<c04521aa>] handle_IRQ_event+0x1a/0x3f
Jun 2 20:14:10 localhost kernel: [<c04535ea>] handle_level_irq+0x81/0xc7
Jun 2 20:14:10 localhost kernel: [<c0453569>] handle_level_irq+0x0/0xc7
Jun 2 20:14:10 localhost kernel: [<c04072bb>] do_IRQ+0xac/0xd1
Jun 2 20:14:10 localhost kernel: [<c04058ff>] common_interrupt+0x23/0x28
Jun 2 20:14:10 localhost kernel: [<c042b2dc>] __do_softirq+0x54/0xba
Jun 2 20:14:10 localhost kernel: [<c04071b7>] do_softirq+0x59/0xb1
Jun 2 20:14:10 localhost kernel: [<c0453569>] handle_level_irq+0x0/0xc7
Jun 2 20:14:10 localhost kernel: [<c042b194>] irq_exit+0x38/0x6b
Jun 2 20:14:10 localhost kernel: [<c04072cc>] do_IRQ+0xbd/0xd1
Jun 2 20:14:10 localhost kernel: [<c04058ff>] common_interrupt+0x23/0x28
Jun 2 20:14:10 localhost kernel: [<f8b0007b>] rtl8169_init_one+0x5c7/0x9d7 [r8169]
Jun 2 20:14:10 localhost kernel: [<c060171d>] _spin_unlock_irqrestore+0x8/0x9
Jun 2 20:14:10 localhost kernel: [<f8aff1f7>] rtl8169_open+0x139/0x194 [r8169]
Jun 2 20:14:10 localhost kernel: [<c05a2f8d>] dev_open+0x2b/0x62
Jun 2 20:14:10 localhost kernel: [<c05a19e1>] dev_change_flags+0x47/0xe4
Jun 2 20:14:10 localhost kernel: [<c05a977b>] rtnl_setlink+0x264/0x365
Jun 2 20:14:10 localhost kernel: [<c05a9517>] rtnl_setlink+0x0/0x365
Jun 2 20:14:10 localhost kernel: [<c05a8dad>] rtnetlink_rcv_msg+0x1c1/0x1e6
Jun 2 20:14:10 localhost kernel: [<c05b4e19>] netlink_run_queue+0x50/0xbe
Jun 2 20:14:10 localhost kernel: [<c05a8bec>] rtnetlink_rcv_msg+0x0/0x1e6
Jun 2 20:14:10 localhost kernel: [<c05a8bab>] rtnetlink_rcv+0x25/0x3d
Jun 2 20:14:10 localhost kernel: [<c05b51b6>] netlink_data_ready+0x12/0x4c
Jun 2 20:14:10 localhost kernel: [<c05b426a>] netlink_sendskb+0x19/0x30
Jun 2 20:14:10 localhost kernel: [<c05b5198>] netlink_sendmsg+0x277/0x283
Jun 2 20:14:10 localhost kernel: [<c0599180>] sock_sendmsg+0xd0/0xeb
Jun 2 20:14:10 localhost kernel: [<c0436e71>] autoremove_wake_function+0x0/0x35
Jun 2 20:14:10 localhost kernel: [<c0436e71>] autoremove_wake_function+0x0/0x35
Jun 2 20:14:10 localhost kernel: [<c04e7100>] copy_from_user+0x3a/0x66
Jun 2 20:14:10 localhost kernel: [<c059932d>] sys_sendmsg+0x192/0x1f7
Jun 2 20:14:10 localhost kernel: [<c0599e0d>] sys_recvmsg+0x1b9/0x1cd
Jun 2 20:14:10 localhost kernel: [<c04e7350>] copy_to_user+0x3c/0x50
Jun 2 20:14:10 localhost kernel: [<c0599c3c>] move_addr_to_user+0x50/0x68
Jun 2 20:14:13 localhost kernel: [<c059a0d6>] sys_getsockname+0x9f/0xb0
Jun 2 20:14:13 localhost kernel: [<c06016f4>] _spin_lock_bh+0x8/0x18
Jun 2 20:14:13 localhost kernel: [<c059adb6>] release_sock+0x12/0x9d
Jun 2 20:14:13 localhost kernel: [<c059a4fc>] sys_socketcall+0x240/0x261
Jun 2 20:14:13 localhost kernel: [<c0404f70>] syscall_call+0x7/0xb
Jun 2 20:14:13 localhost kernel: =======================
Jun 2 20:14:13 localhost kernel: r8169: eth0: link down
It makes no difference if it is set for static or DHCP. As mentioned, it is a hard lock until either the cable is removed or the cable is plugged in.
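For anyone comparing traces across machines, here is a minimal Python sketch that pulls the call-trace frames out of a syslog dump like the one above and lists only the frames that belong to a loadable module (the ones tagged with a trailing [module] name). The regular expression and the function names parse_frames and module_frames are my own illustration, not part of any Fedora tool, and the pattern assumes the 2.6-era i386 "[<addr>] symbol+0xoff/0xlen" frame format seen in this log.

```python
import re

# One call-trace frame as it appears in the syslog dump above, e.g.
#   kernel: [<f8aff1f7>] rtl8169_open+0x139/0x194 [r8169]
# The trailing "[module]" tag is optional (absent for core kernel symbols).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module) per frame; module is None for core kernel code."""
    return [(m.group("symbol"), m.group("module"))
            for m in FRAME_RE.finditer(log_text)]


def module_frames(log_text):
    """Keep only the frames that come from a loadable module."""
    return [(sym, mod) for sym, mod in parse_frames(log_text) if mod is not None]


# A few lines copied from the log in this comment.
sample = """\
Jun 2 20:14:10 localhost kernel: BUG: soft lockup detected on CPU#0!
Jun 2 20:14:10 localhost kernel: [<c0451f3e>] softlockup_tick+0xa5/0xb4
Jun 2 20:14:10 localhost kernel: [<f8b0007b>] rtl8169_init_one+0x5c7/0x9d7 [r8169]
Jun 2 20:14:10 localhost kernel: [<c060171d>] _spin_unlock_irqrestore+0x8/0x9
Jun 2 20:14:10 localhost kernel: [<f8aff1f7>] rtl8169_open+0x139/0x194 [r8169]
Jun 2 20:14:10 localhost kernel: [<c05a2f8d>] dev_open+0x2b/0x62
"""

for symbol, module in module_frames(sample):
    print(f"{module}: {symbol}")
```

Run against the full dump, this kind of filter makes it quick to see which driver was on the stack when the soft lockup was reported, though a module appearing in the trace does not by itself prove the driver is at fault.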
Same issue here ... using the r8169 driver ..
00:0b.0 Ethernet controller: Linksys Gigabit Network Adapter (rev 10)
  Subsystem: Linksys EG1032 v3 Instant Gigabit Network Adapter
  Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 11
  I/O ports at e800 [size=256]
  Memory at df000000 (32-bit, non-prefetchable) [size=256]
  [virtual] Expansion ROM at 20000000 [disabled] [size=128K]
  Capabilities: [dc] Power Management version 2
Any progress on this bug? It's a very early bug # against F7 and it's important: if the server is rebooted or suffers a power failure, it will not start without physical intervention.
OK, I just found a little bit more. When booting Fedora 7, if I select "I" for interactive mode, when it gets to the line:
Start Service Network Y/N (C)ontinue
I select Yes and this error happens:
FATAL: Module not found
Bringing up loopback interface (OK)
FATAL: Module not found
But the computer boots without plugging the ethernet cable in...
But again, starting with the cable plugged in, I go to "I"nteractive mode, and it gets to the line: Start NetworkManagerDispatcher and it hangs until I disconnect the cable. It will probably get past this part if I shut off the service for the Dispatcher.
I found a driver for the 8169 chipset on the Realtek website that was released May 23, but when I try to compile it, I get a few errors. You can find the driver here, http://www.realtek.com.tw/downloads/downloadsView.aspx?Langid=1&PNid=13&PFid=4&Level=5&Conn=4&DownTypeID=3&GetDown=false&Downloads=true#5,7,8,10,982 Can someone try to install it?
There is no point trying this, as it's not just Realtek that has this issue. It's something else besides the actual driver. It's something in Fedora that does an ARP test on bootup / starting. Therefore rebuilding the driver is a pointless exercise; as with all these issues, some people run the same card and have no issues, others do.
I changed this to kernel. This bug is one of the first filed against F7 and it's still new :( I think it's arpwatch; it's something in the ethernet startup. It's got to be fixed. It's really urgent, as you can't restart your server unless you are physically there to unplug and replug the ethernet cable.
David, They might be focused on this one as it is related.... http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242572
Yes, it's similar, except they are totally focused on the Realtek. Other cards are affected. It's also not the kernel. I forced on the fc6 kernel and it still does it. It's Network Manager or arpwatch.
I changed this to kernel. After a bunch of package updates, it was still hanging on boot. However, I forced on the fc6 kernel and the machine boots flawlessly with no eth0 hang. Therefore it's the f7 kernel.
Any update? Can someone at Red Hat at least assign this bug? I know it requires a new kernel to be tested.
It's no longer an issue with the latest f7 kernel, at least for me.
Nope, it's still broken. I have tried a few kernels from updates-testing and there is no change at all. I still have two machines doing this. Note that you need to be careful: I have found that the two machines I have doing this intermittently boot up, but this is regardless of the F7 kernel version.
Reproduced this on an installed system. Looks like the trick might be a 32-bit UP system with an smp kernel (the default f7 kernel is smp). A UP kernel might die too, but we don't build one of those anymore, so I can't say if that makes a difference. I did notice that it doesn't *always* hang while booting, but it seems to happen most of the time. Seeing soft-lockups like that makes me wonder if some code was added recently that works well on true SMP systems but not on UP ones. Bug 242572 seems to be a dup of this....
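To check whether a machine matches the "UP system with an smp kernel" combination described above, something like the following Python sketch could be used. It is only an illustration under assumptions: it counts "processor" entries in the text of /proc/cpuinfo and looks for the "SMP" token in the kernel build string (the version field that uname -v reports). The function names count_cpus and is_up_with_smp_kernel and the sample strings are hypothetical, not taken from any of the affected machines.

```python
def count_cpus(cpuinfo_text):
    """Count logical CPUs by counting 'processor' entries in /proc/cpuinfo text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def is_up_with_smp_kernel(cpuinfo_text, kernel_version):
    """True when exactly one CPU is listed and the kernel build string says SMP."""
    return count_cpus(cpuinfo_text) == 1 and "SMP" in kernel_version.split()


# Hypothetical inputs for illustration only.
sample_cpuinfo = "processor\t: 0\nvendor_id\t: GenuineIntel\nmodel name\t: example CPU\n"
sample_version = "#1 SMP Mon Jan 1 00:00:00 UTC 2007"

print(is_up_with_smp_kernel(sample_cpuinfo, sample_version))
```

On a real system, the first argument would be the contents of /proc/cpuinfo and the second could come from platform.version() in the Python standard library. A True result only says the machine fits the suspected profile; it does not confirm that it will hit the hang.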
Hello folks, I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the fedora kernel. http://fedoraproject.org/wiki/KernelBugTriage I am CC'ing myself to this bug and will try and assist you in resolving it if I can. There hasn't been much activity on this bug for a while. Could you tell me if you are still having problems with the latest kernel? The bug mentioned in #22 looks resolved and indeed appears a dupe. If the problem has gone away then please close this bug or I'll do so in a few days if there is no additional information lodged. Cheers Chris
Chris, yes, it was fixed and never resurfaced on Fedora 7. I have also never seen it in Fedora 8.