Description of problem:

With the 3.3.4-5 kernel, I see similar problems occurring on a number of systems.

Problem 1: The PS/2 keyboard suddenly goes dead. I found no way to revive it other than a reboot. When I plug in a USB keyboard I can continue working with that one, but I also had a case where the USB keyboard got stuck and didn't get unstuck even on disconnect / reconnect. There are absolutely no related messages in the system logs or on the console.

Problem 2: The Ethernet interface goes dead for a number of seconds, then comes up again, reporting a "link up" event. In the meantime, mounted NFS file systems report errors. Example log:

[78495.152019] nfs: server castor not responding, timed out
[78497.960032] nfs: server castor not responding, timed out
[78498.725161] r8169 0000:04:00.0: p20p1: link up
[78504.536033] nfs: server castor not responding, timed out
[78508.744052] nfs: server castor not responding, timed out
[78510.725151] r8169 0000:04:00.0: p20p1: link up
[81541.184081] nfs: server castor not responding, timed out
[81543.992096] nfs: server castor not responding, timed out
[81546.725160] r8169 0000:04:00.0: p20p1: link up
[81800.840031] nfs: server castor not responding, timed out
[81804.725167] r8169 0000:04:00.0: p20p1: link up
[81805.048042] nfs: server castor not responding, timed out
[81809.256033] nfs: server castor not responding, timed out
[81816.725156] r8169 0000:04:00.0: p20p1: link up
[81818.128045] nfs: server castor not responding, timed out
[83411.928031] nfs: server castor not responding, timed out
[83412.725167] r8169 0000:04:00.0: p20p1: link up
[86016.048088] nfs: server castor not responding, timed out
[86018.856031] nfs: server castor not responding, timed out
[86021.664037] nfs: server castor not responding, timed out
[86022.725159] r8169 0000:04:00.0: p20p1: link up
[88090.984022] nfs: server castor not responding, timed out
[88093.792022] nfs: server castor not responding, timed out
[88096.600023] nfs: server castor not responding, timed out
[88098.725149] r8169 0000:04:00.0: p20p1: link up
[88700.192021] nfs: server castor not responding, timed out
[88703.000032] nfs: server castor not responding, timed out
[88704.725162] r8169 0000:04:00.0: p20p1: link up
[90249.192026] nfs: server castor not responding, timed out
[90252.725154] r8169 0000:04:00.0: p20p1: link up
[90513.960089] nfs: server castor not responding, timed out
[90516.725161] r8169 0000:04:00.0: p20p1: link up
[90516.768031] nfs: server castor not responding, timed out
[90519.576032] nfs: server castor not responding, timed out
[90522.384031] nfs: server castor not responding, timed out
[90525.192047] nfs: server castor not responding, timed out
[90528.725155] r8169 0000:04:00.0: p20p1: link up
[90773.960022] nfs: server castor not responding, timed out
[90776.768019] nfs: server castor not responding, timed out
[90779.576020] nfs: server castor not responding, timed out
[90780.725149] r8169 0000:04:00.0: p20p1: link up
[91276.168046] nfs: server castor not responding, timed out
[91278.725164] r8169 0000:04:00.0: p20p1: link up
[91281.532036] nfs: server castor not responding, timed out
[91472.392032] nfs: server castor not responding, timed out
[91475.200033] nfs: server castor not responding, timed out
[91476.725162] r8169 0000:04:00.0: p20p1: link up
[92291.184030] nfs: server castor not responding, timed out
[92292.725163] r8169 0000:04:00.0: p20p1: link up
[92470.312075] nfs: server castor not responding, timed out
[92472.724154] r8169 0000:04:00.0: p20p1: link up
[92818.032017] nfs: server castor not responding, timed out
[92819.436021] nfs: server castor not responding, timed out
[92820.725151] r8169 0000:04:00.0: p20p1: link up
[93959.344023] nfs: server castor not responding, timed out
[93962.152048] nfs: server castor not responding, timed out
[93964.960071] nfs: server castor not responding, timed out
[93966.725160] r8169 0000:04:00.0: p20p1: link up
[94422.208030] nfs: server castor not responding, timed out
[94422.725173] r8169 0000:04:00.0: p20p1: link up
[94424.136022] nfs: server castor not responding, timed out
[94519.592028] nfs: server castor not responding, timed out
[94522.400032] nfs: server castor not responding, timed out
[94524.725163] r8169 0000:04:00.0: p20p1: link up
[95066.232032] nfs: server castor not responding, timed out
[95069.040038] nfs: server castor not responding, timed out
[95070.725177] r8169 0000:04:00.0: p20p1: link up
[95071.848029] nfs: server castor not responding, timed out
[95074.656026] nfs: server castor not responding, timed out
[95077.464043] nfs: server castor not responding, timed out
[95082.725156] r8169 0000:04:00.0: p20p1: link up
[96816.056052] nfs: server castor not responding, timed out
[96818.864046] nfs: server castor not responding, timed out
[96821.672037] nfs: server castor not responding, timed out
[96822.725161] r8169 0000:04:00.0: p20p1: link up
[96824.480035] nfs: server castor not responding, timed out
[96827.288044] nfs: server castor not responding, timed out
[96834.072028] nfs: server castor not responding, timed out
[96834.725156] r8169 0000:04:00.0: p20p1: link up
[106594.288031] nfs: server castor not responding, timed out
[106596.725162] r8169 0000:04:00.0: p20p1: link up
[106716.072058] nfs: server castor not responding, timed out
[106718.880032] nfs: server castor not responding, timed out
[106721.688034] nfs: server castor not responding, timed out
[106722.725160] r8169 0000:04:00.0: p20p1: link up
[112116.725157] r8169 0000:04:00.0: p20p1: link up
[112140.200022] nfs: server castor not responding, timed out
[112143.008030] nfs: server castor not responding, timed out
[112145.816031] nfs: server castor not responding, timed out
[112146.725163] r8169 0000:04:00.0: p20p1: link up
[112148.624031] nfs: server castor not responding, timed out
[112151.432034] nfs: server castor not responding, timed out
[112158.232032] nfs: server castor not responding, timed out
[112158.725154] r8169 0000:04:00.0: p20p1: link up
[117507.472038] nfs: server castor not responding, timed out
[117510.725153] r8169 0000:04:00.0: p20p1: link up
etc.

On another system:

Version-Release number of selected component (if applicable):

kernel-3.3.4-5.fc17.x86_64

How reproducible:

The keyboard problem happened once each on 2 systems, and 5 times in 2 days so far on a third one. The network issue is more or less permanent - see log above.

Steps to Reproduce:
1. boot a system with kernel-3.3.4-5.fc17.x86_64 and use it for a while

Actual results:

See logs above.

Expected results:

No problems :-)

Additional info:

It appears all problems go away when I downgrade to kernel version 3.3.4-5.fc17.x86_64
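To put numbers on how long each outage lasts, the log above can be reduced to intervals running from the first NFS timeout of a burst to the next "link up" event. What follows is only a minimal sketch in Python, not part of the original report: the function name `outages` and the rule "first timeout until next link up" are assumptions made for illustration, and the sample is just the first six lines of the log.

```python
import re

# Matches the kernel timestamp and message, e.g.
# "[78498.725161] r8169 0000:04:00.0: p20p1: link up".
# search() is used so an optional syslog prefix in front is tolerated.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")


def outages(log_text):
    """Return a list of (start, end, duration) tuples in seconds.

    Assumption: an outage starts at the first "not responding" message
    after the previous link-up and ends at the next "link up" message.
    """
    result = []
    start = None
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        timestamp = float(match.group(1))
        message = match.group(2).rstrip()
        if "not responding" in message:
            if start is None:
                start = timestamp
        elif message.endswith("link up") and start is not None:
            result.append((start, timestamp, timestamp - start))
            start = None
    return result


# First six lines of the example log from this report.
SAMPLE = """\
[78495.152019] nfs: server castor not responding, timed out
[78497.960032] nfs: server castor not responding, timed out
[78498.725161] r8169 0000:04:00.0: p20p1: link up
[78504.536033] nfs: server castor not responding, timed out
[78508.744052] nfs: server castor not responding, timed out
[78510.725151] r8169 0000:04:00.0: p20p1: link up
"""

for start, end, duration in outages(SAMPLE):
    print(f"outage from {start:.3f} to {end:.3f}: {duration:.1f} s")
```

A "link up" line with no preceding timeout is ignored by this sketch, so it only measures outages that NFS actually noticed.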
(In reply to comment #0)
> Description of problem:
>
> With the 3.3.4-5 kernel, I see similar problems occurring on a number
> of systems.
<snip>
> On another system:
>
> Version-Release number of selected component (if applicable):
>
> kernel-3.3.4-5.fc17.x86_64
<snip>
> Steps to Reproduce:
> 1. boot a system with kernel-3.3.4-5.fc17.x86_64 and use it for a while
<snip>
> Additional info:
>
> It appears all problems go away when I downgrade to kernel version
> 3.3.4-5.fc17.x86_64

So you've told us that 3.3.4-5.fc17.x86_64 doesn't work, then you tell us it works. Confused.
(In reply to comment #1) > > So you've told us that 3.3.4-5.fc17.x86_64 doesn't work, then you tell us it > works. Confused. Argh... silly me. It is kernel-3.4.0-1.fc17.x86_64 that has the problems, and 3.3.4-5.fc17.x86_64 appears to be fine. Sorry.
Out of curiosity, can you attach the dmesg from a boot with 3.4.0-1? I'd like to see if you have any ASM108x devices in those machines.
(In reply to comment #3) > Out of curiosity, can you attach the dmesg from a boot with 3.4.0-1? I'd > like to see if you have an ASM108x devices in those machines. See attachments. I also included the lspci output.
Created attachment 592287 [details] Boot log and lspci output of system 1
Created attachment 592288 [details] Boot log and lspci output of system 2
The same problems are still present with kernel version 3.4.2-4.fc17.x86_64.

I had the same keyboard lockup twice, and the network issue still exists, too - now with a bit of additional information in one case:

Jun 18 11:43:50 nyx kernel: [ 1738.248031] nfs: server castor not responding, timed out
Jun 18 11:43:53 nyx kernel: [ 1741.712018] ------------[ cut here ]------------
Jun 18 11:43:53 nyx kernel: [ 1741.712029] WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x250/0x260()
Jun 18 11:43:53 nyx kernel: [ 1741.712033] Hardware name: P35-DS3R
Jun 18 11:43:53 nyx kernel: [ 1741.712036] NETDEV WATCHDOG: p20p1 (r8169): transmit queue 0 timed out
Jun 18 11:43:53 nyx kernel: [ 1741.712038] Modules linked in: fuse nfs fscache ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM bridge stp llc xfs nouveau snd_hda_codec_realtek snd_hda_intel mxm_wmi wmi video snd_hda_codec i2c_algo_bit ttm snd_hwdep snd_pcm drm_kms_helper snd_page_alloc drm snd_timer snd coretemp osst st r8169 iTCO_wdt microcode iTCO_vendor_support ch i2c_i801 i2c_core mii soundcore vhost_net tun macvtap macvlan kvm_intel nfsd kvm nfs_acl auth_rpcgss lockd sunrpc uinput binfmt_misc raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor sym53c8xx ata_generic pata_acpi async_tx scsi_transport_spi pata_jmicron [last unloaded: iptable_mangle]
Jun 18 11:43:53 nyx kernel: [ 1741.712114] Pid: 0, comm: swapper/0 Not tainted 3.4.2-4.fc17.x86_64 #1
Jun 18 11:43:53 nyx kernel: [ 1741.712117] Call Trace:
Jun 18 11:43:53 nyx kernel: [ 1741.712119] <IRQ> [<ffffffff8105680f>] warn_slowpath_common+0x7f/0xc0
Jun 18 11:43:53 nyx kernel: [ 1741.712130] [<ffffffff81056906>] warn_slowpath_fmt+0x46/0x50
Jun 18 11:43:53 nyx kernel: [ 1741.712135] [<ffffffff8108661c>] ? ttwu_do_wakeup+0x2c/0xf0
Jun 18 11:43:53 nyx kernel: [ 1741.712140] [<ffffffff815017c0>] dev_watchdog+0x250/0x260
Jun 18 11:43:53 nyx kernel: [ 1741.712144] [<ffffffff81501570>] ? dev_deactivate_queue.constprop.30+0x80/0x80
Jun 18 11:43:53 nyx kernel: [ 1741.712150] [<ffffffff810659b1>] run_timer_softirq+0x141/0x340
Jun 18 11:43:53 nyx kernel: [ 1741.712154] [<ffffffff8105dbb0>] __do_softirq+0xc0/0x1e0
Jun 18 11:43:53 nyx kernel: [ 1741.712160] [<ffffffff815f9cdc>] call_softirq+0x1c/0x30
Jun 18 11:43:53 nyx kernel: [ 1741.712164] [<ffffffff810151f5>] do_softirq+0x75/0xb0
Jun 18 11:43:53 nyx kernel: [ 1741.712168] [<ffffffff8105df85>] irq_exit+0xb5/0xc0
Jun 18 11:43:53 nyx kernel: [ 1741.712172] [<ffffffff815fa61e>] smp_apic_timer_interrupt+0x6e/0x99
Jun 18 11:43:53 nyx kernel: [ 1741.712177] [<ffffffff815f938a>] apic_timer_interrupt+0x6a/0x70
Jun 18 11:43:53 nyx kernel: [ 1741.712179] <EOI> [<ffffffff8101bad2>] ? mwait_idle+0x92/0x1e0
Jun 18 11:43:53 nyx kernel: [ 1741.712187] [<ffffffff8101c50e>] cpu_idle+0xfe/0x120
Jun 18 11:43:53 nyx kernel: [ 1741.712191] [<ffffffff815cda5e>] rest_init+0x72/0x74
Jun 18 11:43:53 nyx kernel: [ 1741.712197] [<ffffffff81cf4c1a>] start_kernel+0x3b7/0x3c4
Jun 18 11:43:53 nyx kernel: [ 1741.712201] [<ffffffff81cf4662>] ? repair_env_string+0x5e/0x5e
Jun 18 11:43:53 nyx kernel: [ 1741.712205] [<ffffffff81cf4346>] x86_64_start_reservations+0x131/0x135
Jun 18 11:43:53 nyx kernel: [ 1741.712209] [<ffffffff81cf444a>] x86_64_start_kernel+0x100/0x10f
Jun 18 11:43:53 nyx kernel: [ 1741.712212] ---[ end trace 933b84f8c20a9beb ]---
Jun 18 11:43:53 nyx kernel: [ 1741.717171] r8169 0000:04:00.0: p20p1: link up
Jun 18 12:00:09 nyx kernel: [ 2717.808048] nfs: server castor not responding, timed out
Jun 18 12:00:11 nyx kernel: [ 2719.717164] r8169 0000:04:00.0: p20p1: link up
Eventually we should split this bug report. I just had the dead keyboard problem with the 3.3.4-5.fc17.x86_64 kernel, too. However, the network issue has not occurred with this kernel so far.
I'm fairly confused on this one. Your attachments in comments #5 and #6 show you're using the nvidia module, which can do weird things on upgrades. But comment #7 doesn't have anything tainted. There are a few known NFS issues in 3.4 that 3.4.2/3.4.3 might fix up. Aside from that, I'm not sure what the keyboard lockup issue would be and comment #7 leads me to believe something is seriously hanging the kernel up.
I have some additional information about the keyboard lockup issue:

1) It seems I always trigger the problem when I'm holding the left shift key for some extended time, typically when I'm selecting a text region in a window with the mouse for copy & paste.

2) I also see other errors when holding the shift key for a long time, for example when I'm typing a long sequence of uppercase letters: sometimes, they will start coming out lower case. For a long time I thought this was an unreliable contact in my (old) keyboard, but now I realize that this happens on 4 different keyboards, so it looks more like a software issue.

3) When the keyboard is dead, I can still log in from another system, and I can run for example

evtest /dev/input/by-path/platform-i8042-serio-0-event-kbd

which shows that the keyboard is still generating normal input events. So the bug must be in somewhat higher layers.
The network issue is still present with 3.4.4-3.fc17.x86_64, see attached boot log.
Created attachment 595892 [details] Boot log with 3.4.4-3 kernel
I've experienced the same (keyboard) problem (Kernel 3.4). The first time, it happened with a PS/2 keyboard. I then connected a different PS/2 keyboard, but that didn't change anything (this is a mainboard that allows PS/2 hotplugging). I connected a USB keyboard and it worked. The next day, the same thing happened - with the new USB keyboard. I didn't have the time to try another keyboard, I just rebooted to make it work again. This happened all of a sudden, while I was working on the computer. The num lock light was still on. Almost no key presses worked anymore, except backspace and a few other keys every 10th time or so (but this might very well be random).
(In reply to comment #13) > The next day, the same thing happened - with the new USB keyboard. I didn't > have the time to try another keyboard, I just rebooted to make it work again. With a USB keyboard, I was usually able to recover simply by unplugging and re-plugging the USB keyboard. If this didn't work on first try, I plugged it into another USB port. I had only a single case since where nothing helped and I really had to reboot. Yes, this is a major PITA!
Happened again with USB keyboard, disconnecting and reconnecting it helped. On 3.4.4-3.fc17.x86_64.
Same with 3.4.4-5.fc17.x86_64...
As for Wolfgang's problem 2: I have something similar, which seems to be fixed by switching off the receive-checksum offload option of the NIC. Test with:

ethtool -K p2p1 rx off

I think it might be related to BZ#635596 from RHEL (which I'm not allowed to read). Kernel 3.4.6-2.fc17.x86_64 still suffers from it.
Same issue here, on FC17 with 3.4.6-2.fc17.x86_64. I have attached an excerpt of the messages log and the lspci output.

Note: my motherboard has 2 NICs, but I only use one... I can't remember why. I guess it has something to do with iPXE, since I use this system over iSCSI. Interestingly enough, NetworkManager seems to try to renew the IP address every 2 hours (this is in the attachment too). Everything in the attached messages took place while the system was idle; I was out, not using it. I got a lockup yesterday.

I'm noticing some issues with my server too. It runs CentOS 6, where even Firefox managed to cause a kernel crash! The video was out, but I managed to shut down by pressing the power button. At the same time, the desktop (FC17) locked up. I have this desktop system overclocked, but it used to work well with FC14. Let me know if you need any more data.
Created attachment 600338 [details] Cut of the relevant part of messages Kernel crash report for 3.4.6-2.fc17.x86_64
Created attachment 600339 [details] My LSPCI LSPCI of my system: Mobo: Gigabyte GA-EP45-UD3P with one NIC disabled.
On 3.4.4-5.fc17.x86_64: It happened again and I've just realized that most of the keys seem to work if I hold them for 1-2 seconds. If I hit num lock, nothing happens (light stays on), but if I hold it for about 2 seconds, then release it, the light switches. At this point, I'm not sure if the behavior was always like this (I don't think so).
Both problems are still present with 3.5.0-2.fc17.x86_64
At least the keyboard problem is still present with 3.7.4-204.fc18.x86_64
Turns out it's a feature, not a bug. It's called "Slow keys". Someone must've thought it would be great to have a feature that'll randomly disable the keyboard (if keys only work after holding them for like 2 seconds then that counts as "disabling" too). Unfortunately it seems to be turned on by default when using GDM (which is what I use).

$ xkbset q | grep "Accessibility Features"
Accessibility Features (AccessX) = On

Holding the Shift key down for 10 seconds will activate this "feature", even though gnome-control-center - Universal Access - Typing - Slow Keys is OFF.

See bug #816764: https://bugzilla.redhat.com/show_bug.cgi?id=816764

I'm going to change the display manager and be done. This is not a kernel bug, and whatever's left of this bug report does not affect me. So I'm outta here.
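For anyone who wants to check several machines for the same condition, the line that `xkbset q` printed above can be tested mechanically. This is a minimal sketch, not something from the original comment: the helper name `accessx_enabled` is hypothetical, only the "= On" line format is taken from the output quoted above, and treating every other value as "off" is an assumption.

```python
def accessx_enabled(xkbset_output):
    """Report whether AccessX is on, given the text printed by `xkbset q`.

    Returns True for a value of "On", False for any other value
    (assumption: the only format seen in this thread is "... = On"),
    and None if the AccessX line is not present at all.
    """
    for line in xkbset_output.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == "Accessibility Features (AccessX)":
            return value.strip().lower() == "on"
    return None


# The line quoted in this comment.
print(accessx_enabled("Accessibility Features (AccessX) = On"))
```

The function only parses text, so it can be fed the captured output of `xkbset q` from each affected system.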
(In reply to comment #24)
> Turns out it's a feature, not a bug. It's called "Slow keys". Someone
> must've thought it would be great to have a feature, that'll randomly
> disable the keyboard (if keys only work after holding them for like 2
> seconds then that counts as "disabling" too). Unfortunately it seems to be
> turned on by default when using GDM (which is what I use).

Thanks for pointing that out - you are right. What a PITA!!