Description of problem:
When using a DQ965GF (965Q chipset motherboard), applications exhibit regular "stutters" of many ms. The simplest way to see this is to play back video.

Version-Release number of selected component (if applicable):
RHEL5 kernels through 2.6.18-8.1.6
RHEL5-rt kernel kernel-rt-2.6.21-23.el5rt (significantly less pronounced)
F7 kernels through 2.6.21-1.3228.fc7

How reproducible:
Always

Steps to Reproduce:
1. Boot system
2. Play video, or run glxgears

Actual results:
Video stutters once every 5-10 seconds.

Expected results:
Clean playback like on old system

Additional info:
If you run "service network stop" and "rmmod e1000", things clear up. Interestingly enough, after modprobing e1000 again and restarting the network, the issue does not seem to come back. I have not confirmed this with extensive testing (it worked once, could be a red herring, etc). If this is the case, it might be an issue with driver load order.
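To put numbers on the "stutters of many ms" rather than judging video by eye, a periodic wakeup loop can be timed and scanned for late wakeups. The following is only a rough sketch: the names `sample_wakeups` and `find_stalls`, the period, and the threshold are all hypothetical choices, and a userspace Python loop is itself subject to scheduler noise, so only stalls well above the threshold are meaningful.

```python
import time


def sample_wakeups(period, count):
    """Sleep `period` seconds `count` times; return the wakeup timestamps (seconds)."""
    stamps = []
    for _ in range(count):
        time.sleep(period)
        stamps.append(time.monotonic())
    return stamps


def find_stalls(timestamps, period, threshold):
    """Return (index, gap) pairs where a wakeup arrived more than
    `threshold` later than the expected `period` after the previous one.
    All three arguments must use the same time unit."""
    stalls = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap - period > threshold:
            stalls.append((i, gap))
    return stalls


if __name__ == "__main__":
    # Hypothetical settings: 10 ms period, run for about 30 s,
    # report any wakeup that is more than 5 ms late.
    stamps = sample_wakeups(0.010, 3000)
    for index, gap in find_stalls(stamps, 0.010, 0.005):
        print(f"sample {index}: gap of {gap * 1000:.1f} ms")
```

Running this once with e1000 loaded at gigabit and once after "rmmod e1000" would show whether the stalls follow the same 5-10 second rhythm as the video stutter.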
Created attachment 157229 dmidecode
Created attachment 157230 lsmod
Created attachment 157231 lspci -v -v
Attached dmidecode, lsmod, and lspci from the system. (Note - it is running F7 kernel right now)
Rod,

Anything show up in the logs like watchdog timeouts or anything like that? What about disabling TSO? See any differences when turning it off? I ask mostly because you are seeing bursty traffic, and since TSO does stuff in chunks it *could* be a culprit. Normally I'm not sure I'd even suggest that anymore since TSO seems pretty stable, but the jitter makes me wonder.

You could also try my rhel5 test kernels, but it's unlikely they will make a difference since f7 seems hosed as well.

http://people.redhat.com/agospoda/#rhel5

One last suggestion... are you running NetworkManager on this system by any chance? It looks like there may be some sort of workaround for the 82566 that could cause some delays, but those delays will really only happen if doing a bunch of ethtool operations. NetworkManager might be a good candidate for something that calls the ethtool ioctl a bunch. Here's the function I'm talking about:

/******************************************************************************
 * Work-around for 82566 Kumeran PCS lock loss:
 * On link status change (i.e. PCI reset, speed change) and link is up and
 * speed is gigabit-
 * 0) if workaround is optionally disabled do nothing
 * 1) wait 1ms for Kumeran link to come up
 * 2) check Kumeran Diagnostic register PCS lock loss bit
 * 3) if not set the link is locked (all is good), otherwise...
 * 4) reset the PHY
 * 5) repeat up to 10 times
 * Note: this is only called for IGP3 copper when speed is 1gb.
 *
 * hw - struct containing variables accessed by shared code
 ******************************************************************************/
static int32_t e1000_kumeran_lock_loss_workaround(struct e1000_hw *hw)
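For readers who want to see where the delay in that workaround could come from, the numbered steps in the comment can be modelled as a bounded retry loop. This is a Python illustration of the control flow only, not the driver's C code: `pcs_lock_lost` and `reset_phy` are hypothetical callables standing in for the register read and the PHY reset, and the comment does not say what happens after the tenth failed attempt, so the `False` return is an assumption.

```python
import time

MAX_ATTEMPTS = 10    # step 5: "repeat up to 10 times"
LINK_WAIT_S = 0.001  # step 1: "wait 1ms for Kumeran link to come up"


def lock_loss_workaround(pcs_lock_lost, reset_phy, enabled=True, sleep=time.sleep):
    """Model of the steps in the comment above.

    pcs_lock_lost: hypothetical callable, True if the PCS lock loss bit is set.
    reset_phy:     hypothetical callable that resets the PHY.
    Returns True if the workaround is disabled or the link reports lock,
    False if lock was still lost after MAX_ATTEMPTS (assumed behaviour).
    """
    if not enabled:                  # step 0: disabled, do nothing
        return True
    for _ in range(MAX_ATTEMPTS):    # step 5: bounded retries
        sleep(LINK_WAIT_S)           # step 1: wait 1 ms
        if not pcs_lock_lost():      # steps 2-3: bit clear means locked
            return True
        reset_phy()                  # step 4: lock lost, reset and retry
    return False
```

In this model the worst case is ten 1 ms waits plus ten PHY resets per link status change. The comment does not say how long a PHY reset takes, so the sketch gives the shape of the delay rather than its size.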
NetworkManager is off, and there is hardly any ethernet traffic, just something like an idle ssh session.

Again, the issue is that video is jittery, regularly pausing for a few ms every few seconds. It is not network traffic. For example, I can boot the system and it exhibits the problem with just about no network traffic whatsoever. I remove and reinsert the e1000 module and the problem is gone, even if I am scping 2 gigabyte files to another machine.

This issue can be seen using the intel driver for the built-in graphics (and via the PCI Express ADD2 SDVO port), but it is also found when using a cheap PCI video card with the nv driver.

Is there any quick way to monitor PCI resets or activity? (e.g. let's say the issue is related to the problem that NM tries to work around in the above function, not the solution)
So I have seen the issue go away without removing the module. Interestingly enough, I am thinking that it is related to speed. Here is an annotated log excerpt:

Jun 23 16:51:47 tv2 kernel: e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
Jun 23 16:51:47 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
**system has come up at gigabit**
**video issues observed**
**service network stop**
Jun 23 16:58:42 tv2 kernel: e1000: eth0: e1000_reset: Hardware Error
**video issues go away**
**service network start**
Jun 23 16:59:13 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
Jun 23 16:59:13 tv2 kernel: e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
**Hmm it didn't come up at 1000**
**ethtool -s eth0 speed 1000**
Jun 23 17:03:17 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Jun 23 17:03:20 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Down
Jun 23 17:03:23 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
Jun 23 17:03:23 tv2 kernel: e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO

I tried several more times to set 1000, and each time the link automatically went back down to 100.
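To line up link speed changes against the times the video issues were observed, the e1000_watchdog link messages can be pulled out of the kernel log mechanically. A small sketch, assuming the syslog line layout shown in the excerpt above; the regular expression and the function name `link_events` are illustrative only and other kernels or syslog configurations may format these lines differently.

```python
import re

# Matches lines such as:
#   Jun 23 16:51:47 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps ...
#   Jun 23 17:03:20 tv2 kernel: e1000: eth0: e1000_watchdog: NIC Link is Down
LINK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+e1000:\s+"
    r"(?P<iface>\w+):\s+e1000_watchdog:\s+NIC Link is\s+"
    r"(?P<state>Up|Down)(?:\s+(?P<speed>\d+)\s+Mbps)?"
)


def link_events(lines):
    """Return (timestamp, interface, state, speed_mbps) for each link message.

    speed_mbps is None for "Link is Down" lines, which carry no speed.
    Lines that are not e1000_watchdog link messages are ignored.
    """
    events = []
    for line in lines:
        match = LINK_RE.match(line.strip())
        if match:
            speed = match.group("speed")
            events.append((
                match.group("stamp"),
                match.group("iface"),
                match.group("state"),
                int(speed) if speed else None,
            ))
    return events
```

Feeding it the lines of the system log (for example the contents of /var/log/messages, if that is where kernel messages go on the system) gives a compact timeline of when the link was at 1000 and when it was at 100.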
Also, removing and re-probing the e1000 module does not change the fact that it can't do gigabit.

So the problem seems to be that:
1. e1000 can't get into gigabit mode except at boot
2. when in gigabit mode from boot, latency is seen
Rod,

It seems that your system has TSO disabled when linked up at 100Mbps. Can you reproduce the problem and then try

# ethtool -K eth0 tso off

and see if you still have the extra jitter in your video stream? You can verify that TSO is off by doing a

# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off

Thanks!
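If the TSO state needs to be checked repeatedly while testing (for example before and after each link renegotiation), the "ethtool -k" output can be parsed instead of read by eye. A minimal sketch, assuming the output format shown above; other ethtool versions may label the fields differently, and the function names `parse_offloads` and `tso_enabled` are hypothetical.

```python
def parse_offloads(text):
    """Turn "name: on|off" lines of `ethtool -k` output into a dict of booleans.

    The header line ("Offload parameters for eth0:") and any line whose
    value is not exactly "on" or "off" are skipped.
    """
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value in ("on", "off"):
            settings[name.strip()] = (value == "on")
    return settings


def tso_enabled(text):
    """Return True/False for the TSO setting, or None if it is not listed.

    Assumes the field label used in this thread: "tcp segmentation offload".
    """
    return parse_offloads(text).get("tcp segmentation offload")
```

The text to parse could come from something like `subprocess.run(["ethtool", "-k", "eth0"], capture_output=True, text=True).stdout`, run with sufficient privileges on the affected machine.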
The system no longer negotiates gigabit on boot, it just ends up at 100. Thus I can't reproduce the issue anymore.
Rod, Since I'm guessing that your device should never have negotiated to 1Gbps anyway (since it now only does 100Mbps), I'm going to close this as resolved in 5.1. Please reopen if needed.