Created attachment 661000
gzipped tar file containing output of dmesg, /var/log/messages from last boot, listing of hardware on machine, output of "rpm -qa" command

Description of problem:

----
NOTE - This machine uses the rpmfusion nvidia drivers. Removing them and going to the Fedora-supplied drivers does not change anything below: first one and then both ethernet cards become irreversibly disabled.
----

This machine is a workstation connected both directly to the internet (no proxy) and to a local (192.168.1.xxx) LAN, so it has 2 network cards. When the machine is rebooted, both cards work "normally" and behave as expected. The cards are:

eth0 - Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
eth1 - D-Link System Inc DGE-560T PCI Express Gigabit Ethernet Adapter (rev 13)

Depending on system activity, after a few minutes, and always by half an hour or so, eth0 (the Realtek card) is suddenly disabled and I have no access to local printers, other machines, etc. NetworkManager grays out the "enable/disable" buttons for BOTH cards, and the card cannot be re-enabled.

The problem is that if I put my previous FC16 drive back in the system instead of the FC17 drive, then both cards work perfectly and there are no problems. The first time this happened, I assumed that the Realtek module on the motherboard had failed and I bought the Realtek pciexpress card which is being disabled here (the nic on the MB still works fine, but only under FC16; FWIW it is never enabled under FC17 at all, hence the new card).

A few hours or a day or so later, eth1 joins it as disabled, and the machine has no access to anything. Period. (I tried a usbnic "borrowed" from another machine, and it, too, was disabled eventually.) So I suspect a problem either in the kernel, NetworkManager, or other networking component(s).

So far the only "WORKAROUND" I've found is to reboot the machine, and then the problem comes back again every time.

The weird part here is that at some point (a few kernels back, but I don't know which kernel - sorry) the cards were not disabled that I know of, but that could be merely because I was working for 2 weeks on a critical project which required coding/compiling/debugging but no network access...

Version-Release number of selected component (if applicable):
The "obvious" culprit is NetworkManager; however, I'm not positive, so I'm giving more info than requested.
Kernel: kernel-3.6.9-2.fc17.x86_64 (has happened with 3.6.8-2 and 3.6.7-4 as well, but before that not sure).
NetworkManager

How reproducible:
Happens every time

Steps to Reproduce:
1. Boot the machine.
2. Log in. Then do a telnet, ssh, rcp, etc. to another machine or print to the printer.
3. Wait a few minutes - the Realtek card (eth0) will have been disabled.
4. Wait a few hours and the D-Link card (eth1) will also have been disabled.

Actual results:
First eth0 and then eventually (a few hours up to, say, 24 hours later) eth1 will also be disabled.

Expected results:
Should never be disabled.

Additional info:
The logs you've attached don't reveal any obvious problem. However, the messages log seems quite truncated. Would you attach a more complete log, so we can see the initialization after boot and from NetworkManager start?

Also, please paste the output of:
$ nmcli dev status

What is the output of:
$ ip a

Also, try disabling or relabelling SELinux; there are a bunch of error messages.
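Since the interfaces go down minutes or hours after boot, it may help to record link state over time rather than only once. The following is a minimal sketch, not something from the attached logs: it assumes the one-line-per-interface output of `ip -o link show`, and the names `parse_link_states` and `current_link_states` are hypothetical helpers made up for illustration.

```python
import re
import subprocess

# One record of `ip -o link show` looks roughly like:
# "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ... state UP ..."
_LINK_RE = re.compile(
    r"^\d+:\s+(?P<name>[^:@\s]+)(?:@[^:\s]+)?:\s+"
    r"<(?P<flags>[^>]*)>.*?\bstate\s+(?P<state>\S+)"
)


def parse_link_states(text):
    """Map interface name -> (operstate, has_carrier) from `ip -o link show` text."""
    states = {}
    for line in text.splitlines():
        match = _LINK_RE.match(line)
        if match is None:
            continue  # skip lines that are not interface records
        flags = match.group("flags").split(",")
        # LOWER_UP is set when the driver reports a carrier on the link.
        states[match.group("name")] = (match.group("state"), "LOWER_UP" in flags)
    return states


def current_link_states():
    """Run `ip -o link show` on this machine and parse its output."""
    result = subprocess.run(
        ["ip", "-o", "link", "show"],
        capture_output=True, text=True, check=True,
    )
    return parse_link_states(result.stdout)
```

Calling `current_link_states()` periodically (for example from cron) and logging the result with a timestamp would show roughly when eth0 and eth1 change state, which can then be matched against /var/log/messages.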
I'll try to be brief. Sorry to have taken so long to respond - the holidays messed up my schedule beyond belief.

To make a long story short, I continued to read several different threads about problems which sounded similar to mine on FC17. A couple of other people described the same problem there, and their issue was the r8169 driver for Realtek NICs. Here's the output from lspci -vv -b for the NIC on the motherboard:

05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B \
        PCI Express Gigabit Ethernet controller (rev 03)
    Subsystem: Giga-byte Technology GA-EP45-DS5 Motherboard
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- \
        Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- \
        <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 5
    Region 0: I/O ports at be00
    Region 2: Memory at fd3ff000 (64-bit, prefetchable)
    Region 4: Memory at fd3f8000 (64-bit, prefetchable)
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,\
            D2+,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0200c  Data: 4162
    Capabilities: [70] Express (v2) Endpoint, MSI 01
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s \
            <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- \
            Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ \
            TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, \
            Latency L0 <512ns, L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- \
            CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ \
            DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- \
            SpeedDis-, Selectable De-emphasis: -6dB
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [ac] MSI-X: Enable- Count=4 Masked-
        Vector table: BAR=4 offset=00000000
        PBA: BAR=4 offset=00000800
    Capabilities: [cc] Vital Product Data
        Unknown small resource type 00, will not decode more.
    Capabilities: [100 v1] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- \
            RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- \
            RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- \
            RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [140 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
            Status: NegoPending- InProgress-
    Capabilities: [160 v1] Device Serial Number 03-00-00-00-68-4c-e0-00
    Kernel driver in use: r8169

Originally I had assumed that the nic on my MB had gone bad. As it happened, the pcie card I bought to "replace" it used another Realtek chip and the same driver.

After reading everything I could find, I finally decided to wipe out my installation (after doing an rpm -qa > /tmp/rpm-qa.save) and then restore everything from scratch.
When the system booted I STILL couldn't see the Realtek card (the pcie card). So then, following a suggestion that someone made on a H/W discussion list, I removed the newer chip and went back to the original nic on my MB (re-enabling it, that is). It now works just fine.

I don't know whether they filed bug reports or not, but several other folks had reported having what sounds like the same problem: that is, at some point one of the fc17 kernels had an r8169 driver that stopped the card from working and, for a couple of kernels, from even being recognized. A very recent kernel fixed the problem - for the older NIC on my MB anyway.

So I suspect that if there is an open Bugzilla issue about the "realtek problem," this is just another manifestation of it and you should go ahead and mark it as such. Otherwise, I'm back up and running now, so this is no longer an issue for me.

Strangely, after everything was "working" again, I DID try disabling the nic on the MB and trying the pcie card again (which had worked at one point with the kernel which introduced the problem in the first place - and sorry, I'm no longer sure which one that was). But it still didn't work. If I still had the card I'd update my bug report accordingly, but I swapped it with someone for a different card (using a different chip) and I don't have the ability to do any "testing" with it at this point - sorry.
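For anyone comparing kernels or cards as described in this comment, it can be useful to confirm which kernel driver each Ethernet device is actually bound to. The sketch below is illustrative only: it assumes the usual layout of `lspci -k` output (an unindented device line followed by indented detail lines, including "Kernel driver in use:"), and `ethernet_drivers` is a hypothetical helper name.

```python
def ethernet_drivers(lspci_text):
    """Return a list of dicts (slot, description, driver) for Ethernet controllers.

    `lspci_text` is assumed to be the text output of `lspci -k`.
    `driver` stays None when no "Kernel driver in use" line is present,
    which would mean no driver is bound to that device.
    """
    devices = []
    current = None
    for line in lspci_text.splitlines():
        if not line.strip():
            current = None  # a blank line ends the current device block
            continue
        if not line[0].isspace():
            # Unindented line: "<slot> <class>: <vendor and device>"
            slot, _, rest = line.partition(" ")
            current = {"slot": slot, "description": rest, "driver": None}
            if "Ethernet controller" in rest:
                devices.append(current)
        elif current is not None:
            key, sep, value = line.strip().partition(": ")
            if sep and key == "Kernel driver in use":
                current["driver"] = value.strip()
    return devices
```

Feeding it the captured output of `lspci -k` under both the working and the failing kernel would show whether the card is missing entirely, present without a driver, or bound to r8169 in each case.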
(In reply to comment #2)
> I'll try to be brief.
>
> Strangely after everything was "working" again, I DID try disabling the nic
> on the MB and trying the pcie card again (which had worked at one point
> with the kernel which introduced the problem in the first place - and
> sorry, I'm no longer sure which one that was). But it still didn't work.
> If I still had the card I'd update my bug report accordingly, but I swapped
> it with someone for a different card (using a different chip) and I don't
> have the ability to do any "testing" with it at this point - sorry.

Well, as the problem was in the Realtek driver, I will move the bug to kernel, but close it for now.
Should you have problems with the chip in the future, consider reopening the bug and attaching any relevant data.

Thanks.