Bug 1551373
Summary: | laptop keyboard and trackpad don't work after boot to level 1, 3, or 5 but work after lid used to suspend+resume
---|---
Product: | [Fedora] Fedora
Reporter: | aaronsloman <a.sloman>
Component: | kernel
Assignee: | Kernel Maintainer List <kernel-maint>
Status: | CLOSED INSUFFICIENT_DATA
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high
Priority: | unspecified
Version: | 27
CC: | airlied, a.sloman, bskeggs, btissoir, dchen, ewk, hdegoede, ichavero, itamar, jarodwilson, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mchehab, mjg59, peter.hutterer, steved
Flags: | jforbes: needinfo?
Hardware: | x86_64
OS: | Linux
Fixed In Version: | 4.15.10-300.fc27.x86_64
Last Closed: | 2018-08-29 15:02:20 UTC

Description
aaronsloman
2018-03-05 00:45:57 UTC
Just noticed typing errors after Steps to Reproduce:
3. Use likd to suspend and resume.
should be
3. Use lid to suspend and resume.
Also I left a spurious "Because the problem"... (From an earlier comment: Because the problem occurs on boot at levels 1, 3 and 5 it cannot be related to the window manager. I meant to delete all of that, because it is stated lower down now.) Apologies for any confusion.

The following additional information may be useful in identifying the problem, or problems. I have found that although the laptop keyboard and trackpad are ignored after a full boot, they work immediately after hibernate/resume. So it looks as if some crucial items in the hibernate/resume and suspend/resume code should also be in the initial boot code??

Fixing that would still leave the "timeout" problem: input via keyboard or trackpad stops working if neither is used for about 20 minutes. External (USB) keyboard and mouse work in that case, but using them does not wake up the built in keyboard/trackpad. I wonder whether
-- the timeout mechanism disabling keyboard and trackpad should be removed or
-- after timeout occurs there should be detection of use of keyboard or trackpad, triggering restoration of their full functionality.
I suspect the problem is the latter: i.e. a power saving mechanism turns off detection of keyboard/trackpad, but there is nothing to detect that that should be undone. If they are turned off, some interrupt mechanism should detect the need to turn them on without suspend/resume having to be invoked. (I don't write programs at that level, so I apologise if my suggestions are irrelevant.)

Meanwhile, is there some way I can turn off the timeout mechanism until there is a complete fix? Should I split this into two bug reports -- one for non-functionality at boot and one for the timeout problem? (Neither problem occurs in 4.8.6-300.fc25.x86_64)
===
I apologise for the typos in my initial report -- working too late at night, half asleep.

I've tried removing tlp completely. This made no difference to the keyboard timeout. So presumably the problem is deeper in the system.

libinput isn't really used on the VT (unless you're using consolation or something), so this is most likely a kernel bug. You can verify this by running sudo evemu-record and selecting the devices that don't work. If you don't see events when you use them, then this is definitely a kernel issue. We'll need a dmesg from you for further debugging, thanks.

Created attachment 1404762
dmesg output running kernel 4.8.6(f25) and kernel 4.15.6(f27)
Two files included in tar file
dmesg-4.8.6-and-4.15.6.tgz
FILE1:
dmesg-stone-4.8.6-300.fc25.x86_64
booting of kernel 4.8.6 to level 3, with keyboard working
after running startx keyboard and touchpad work
FILE2:
dmesg-stone-4.15.6-300.fc27.x86_64-startx-timeout
search for "******" for inserted comments indicating actions
booting of kernel 4.15.6 to level 3, with keyboard not working
shut lid to suspend, paused, then opened: keyboard working
ran startx to enter graphic mode with ctwm window manager.
keyboard and touchpad working.
paused for about 10 minutes, found keyboard and touchpad not working
shut and opened lid as above. Both working again.
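
A minimal sketch, not part of the original report, of one way to compare the two dmesg files described above: it strips the leading timestamps, keeps only lines that mention input-related drivers, and lists the lines present in one log but missing from the other. The keyword list is an assumption about what matters for a built-in keyboard and touchpad (nobody in this bug asked for this filter), and the function names are hypothetical.

```python
import re
import sys

# Assumption: these substrings are a guess at what is relevant to a built-in
# PS/2 keyboard/touchpad; they are not taken from the bug report.
KEYWORDS = ("i8042", "atkbd", "psmouse", "serio", "input:")

# Matches the "[   12.345678] " prefix that dmesg puts on each line.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def input_lines(dmesg_text, keywords=KEYWORDS):
    """Return lines mentioning any keyword, with the timestamp removed."""
    kept = []
    for line in dmesg_text.splitlines():
        stripped = _TIMESTAMP.sub("", line)
        if any(k in stripped for k in keywords):
            kept.append(stripped)
    return kept


def only_in_first(first_text, second_text):
    """Filtered lines that appear in the first log but not in the second."""
    second = set(input_lines(second_text))
    return [line for line in input_lines(first_text) if line not in second]


# Only runs when two file names are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], errors="replace") as a, \
            open(sys.argv[2], errors="replace") as b:
        for line in only_in_first(a.read(), b.read()):
            print(line)
```

Saved under a hypothetical name such as compare_dmesg.py, it could be run with the two file names from the tar file as arguments. Swapping the arguments shows what only the newer kernel logs.
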
Just sent attachment (tar file with output of dmesg from 4.8.6-300.fc25.x86_64 and output of dmesg from 4.15.6-300.fc27.x86_64, the latter with comments indicating where keyboard/touchpad were and were not working). Comments are flagged by asterisks and text in capitals. Now running evemu-record and waiting for keyboard to stop working. Will send results later. Please note: I am somewhat out of my depth. If I've provided something different from what's required, please let me know what I should have done.

Something else that may be relevant. I sometimes get messages like this today:
Message from syslogd@stone at Mar 6 09:36:32 ...
kernel:Uhhuh. NMI received for unknown reason 2c on CPU 0.
Message from syslogd@stone at Mar 6 09:36:32 ...
kernel:Do you have a strange power saving mode enabled?
Message from syslogd@stone at Mar 6 09:36:32 ...
kernel:Dazed and confused, but trying to continue
That happened this morning while running 4.15.6-300.fc27 before I rebooted into 4.8.6-300.fc25 to get comparative dmesg. Since I returned to 4.15.6 it has not repeated.

Apologies for delay -- I've been out. Results of use of evemu-record: I booted the machine to level 3 then used 'startx' to enter graphic mode, then ran evemu-record both with
option 4: = AT Translated Set 2 keyboard
and
option 6: = SynPS/2 Synaptics TouchPad
In both cases the relevant actions (key press, touch pad motion) were reported. In both cases after the machine was left idle for a time, then tested, the actions produced no response. But suspend + resume (using laptop lid) restored the responses. This also worked when I ran evemu-record while logged in to the laptop from a linux PC using a wifi connection. It looks as if something in kernel 4.8.6-300 was lost in later kernels. Let me know if there's anything else I can do. Thanks.

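
For anyone repeating the evemu-record test, a small sketch of how the device names quoted above could be matched to their event nodes by reading /proc/bus/input/devices. This is an illustration only: the function name is hypothetical, and the idea that the evemu-record option numbers correspond to /dev/input/eventN nodes is an assumption, not something stated in this report.

```python
def parse_input_devices(text):
    """Map each input device name to its /dev/input/eventN nodes.

    `text` is the content of /proc/bus/input/devices, which consists of
    blank-line-separated blocks with lines such as
    N: Name="..." and H: Handlers=... .
    """
    devices = {}
    for block in text.strip().split("\n\n"):
        name, events = None, []
        for line in block.splitlines():
            if line.startswith("N: Name="):
                name = line[len("N: Name="):].strip().strip('"')
            elif line.startswith("H: Handlers="):
                handlers = line[len("H: Handlers="):].split()
                # Keep only the evdev handlers, e.g. "event4".
                events = [h for h in handlers if h.startswith("event")]
        if name is not None:
            devices[name] = ["/dev/input/" + e for e in events]
    return devices
```

Called as parse_input_devices(open("/proc/bus/input/devices").read()), it returns a dictionary in which the keyboard and touchpad can be looked up by the names evemu-record displays.
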
This morning, after resume from hibernate:
Message in xterm window:
Message from syslogd@stone at Mar 7 09.00.32 do_IRQ: 3.36 No irq handler for vector
Also in /var/log/messages
Mar 7 09:00:32 stone kernel: do_IRQ: 3.36 No irq handler for vector
There are occurrences about once a day, sometimes twice, in /var/log/messages*

Yet another (strange?) piece of information. This morning I was using this laptop to give a presentation followed by discussion. It was connected via VGA to display panel on the wall. I noticed after the discussion period that although I had not touched the laptop for over 30 minutes the keyboard and mousepad were both still working.

In order to check this I later tried at home using an old computer monitor with VGA connector. I left it connected for over an hour, without touching the keyboard or the mousepad. After that, everything worked: no dead keyboard or mousepad. As a control, I then disconnected the monitor and left the computer untouched for about 50 minutes. After that the keyboard and mousepad did not work, until suspend+resume, using the lid. How can attaching an external display prevent keyboard and mousepad being disabled?

Have just updated to kernel 4.15.8-300.fc27.x86_64. All the previously mentioned problems remain - can't login immediately after boot without doing suspend+resume, and the keyboard and trackpad stop working if neither is used for a while (20-30 minutes), but suspend resume revives both.

However, I have found a 'hack' that reduces the hassle after reboot. Normally I don't reboot unless there's a new kernel or a lot of packages have been updated. Then I boot to level 3 and after any bookkeeping run startx. If there's no reason to reboot I use hibernate/resume, which works perfectly. To save time/hassle after boot I have found a temporary hack: To avoid having to shut the lid to suspend then pause for flashing light and then open the lid, I altered a startup script to run suspend before the login prompt.
In /etc/rc.d/init.d/livesys-late after this line:
/etc/init.d/functions
I inserted:
systemctl suspend
So now, the initial boot process does not reach the login prompt. Instead it suspends (with suspend light flashing). I press the start switch and it resumes, gives the login prompt, and I can login normally. This is less hassle than having to use the lid to enable login. The annoying timeout problem remains, however.

This makes me wonder whether the boot login problem is not in the kernel but in one of the startup scripts -- something in a startup script needs to do what resume from suspend does? But that would not explain why in Fedora 25 the boot login and idle timeout problems both started on my laptop after kernel 4.8.6-300.fc25.x86_64. All the later F25 kernels had the login problem. Going back to 4.8.6-300 solved the problem.

Updated to 4.15.10
kernel-modules-extra-4.15.10-300.fc27.x86_64
kernel-headers-4.15.10-300.fc27.x86_64
kernel-modules-4.15.10-300.fc27.x86_64
kernel-4.15.10-300.fc27.x86_64
kernel-devel-4.15.10-300.fc27.x86_64
kernel-core-4.15.10-300.fc27.x86_64

Good news: So far, after partial tests it looks as if the keyboard/mouse problems have been fixed. I removed the edit to /etc/rc.d/init.d/livesys-late mentioned in the previous comment and so far I have found: The laptop now boots to level 3 with keyboard and touchpad working, and also after hibernate. I started the ctwm window manager and left an xterm running for an hour without touching keyboard or mousepad. After that both worked, without suspend/resume being required. So it looks as if the timeout problem has also been fixed. I have not yet had time to test booting to level 5. I'll try that when I have time.

I've just got this old message again, shortly after hibernate+resume:
Message from syslogd@stone at Mar 21 13:39:25 ...
kernel:do_IRQ: 3.36 No irq handler for vector
I assume that's totally related to the bug/s reported here.
Many thanks to all involved in fixing the keyboard/mousepad bug(s). I won't mark this as closed until I've checked boot to level 5 (direct to graphical login).

Correction: in comment #11 I wrote:
TYPO in last message
> Message from syslogd@stone at Mar 21 13:39:25 ...
> kernel:do_IRQ: 3.36 No irq handler for vector
>
> I assume that's **totally related** to the bug/s reported here.
Sorry: that should have been "totally UNrelated". Written in too much haste.

Anyhow I have now tried booting to level 5 (graphical mode login) and both the keyboard and the mousepad worked. So I no longer need to connect external usb mouse or keyboard to log in in graphic mode. All the faults I reported on 5th March seem to have been fixed in the latest kernel. Is 16 days a record?

Is there any more I can do to help track down cause of this message, which seems to come regularly after resume from hibernate?
Message from syslogd@stone at <date> <time> ...
kernel:do_IRQ: 3.36 No irq handler for vector
On a hunch, I tried hibernating with wifi off, and I did not get that message after resume, until I explicitly restarted wifi. Likewise turning wifi off then on manually, using nmcli, produces that message (sometimes with additions):
Turn off wifi:
nmcli networking off
Turn it on:
nmcli networking on
produces:
Message from syslogd@stone at Mar 21 20:56:18 ...
kernel:do_IRQ: 3.36 No irq handler for vector
Despite the message, the command works. Let me know if there's anything else I can do about this.

The main problem has been fixed. The "No irq handler" problem is unrelated, and I'll start a new bug report for that (I've now found it on both desktop and laptop machines running F27).

The "No irq handler" problem is now reported in bug #1562360 i.e.
https://bugzilla.redhat.com/show_bug.cgi?id=1562360
The bug report describes different manifestations in a notebook with wireless connection and a desktop PC with ethernet connection, both running Kernel 4.15.13-300.fc27.x86_64. I've included dmesg output with relevant bits marked.

*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.
Fedora 27 has now been rebased to 4.17.7-100.fc27. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 28, and are still experiencing this issue, please change the version to Fedora 28.
If you experience different issues, please open a new bug report for those.

*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 5 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.