Bug 1551373 - laptop keyboard and trackpad don't work after boot to level 1, 3, or 5 but work after lid used to suspend+resume [NEEDINFO]
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2018-03-05 00:45 UTC by aaronsloman
Modified: 2018-08-29 15:02 UTC (History)

Fixed In Version: 4.15.10-300.fc27.x86_64
Last Closed: 2018-08-29 15:02:20 UTC
jforbes: needinfo?


Attachments
dmesg output running kernel 4.8.6(f25) and kernel 4.15.6(f27) (34.16 KB, text/plain)
2018-03-06 10:51 UTC, aaronsloman

Description aaronsloman 2018-03-05 00:45:57 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Build Identifier: 

I am reporting this as connected with libinput, though I don't have proof that the fault is in that package. It could be in a deeper part of the kernel.

In addition to the fault in the summary there's also a time-related fault, summarised below.

MAIN symptom:
laptop keyboard and trackpad don't work after boot to level 1, 3, or 5 (i.e. this is not graphics related).

But they do then work after the laptop lid is used to suspend+resume, though I have to wait for the disc-activity lights to stop after shutting the lid, and again after opening it.

When the laptop keyboard and trackpad don't work, external keyboard and mouse (connected via usb) do work.

However, once the built-in keyboard and trackpad have been unused for about 20 minutes or more they stop working, even if the laptop is in active use via ssh. Again, suspend+resume gets them working.

NOTE: this bug is not present in F25 kernel 4.8.6-300.fc25.x86_64, the last kernel that worked for me!

In all subsequent kernels in F25 and F27 I had this sort of bug (laptop keyboard and mousepad not working, and temporarily fixed by suspend+resume using lid).

I have set up F25 root and F27 root in different partitions and can boot to either. If I boot to F25 using kernel 4.8.6, none of these symptoms is present. However, screen handling is inferior, e.g. screen 'tearing' and some text distorted. Screen handling seems to have been fixed in the current F27 kernel; otherwise I would go back to F25 and kernel 4.8.6-300.


Reproducible: Always

Steps to Reproduce:
1. Use grub.cfg or bootmenu edit to boot into level 1, or 3, or 5
2. Try to login using laptop keyboard: it fails.
3. Use likd to suspend and resume. keyboard works and login succeeds.

After logging in at level 3 or 5 leave the machine unused for about 20-30 minutes, or use it only via ssh from another machine. The laptop keyboard and trackpad then stop working and can only be fixed (temporarily) using suspend and resume.
Because the problem 
Actual Results:  
Frozen keyboard and trackpad input.

Expected Results:  
Input should work correctly on keyboard and trackpad, as it did in kernel 4.8.6-300.

It also worked during the installation process using a live USB stick, to install Fedora 27 + XFCE (I was also using XFCE on F25).

But I use the CTWM window manager instead of XFCE window manager. The window manager can't be relevant as the problem arises if I boot to level 1 or level 3.

The laptop is a Stonebook Mini, which is a re-badged Clevo W515LU
Original hardware configuration (April 2016):

11.6", 1.2 kg, 4-core Celeron N3160, 8 GB RAM, 500 GB SSD, 1366 x 768 matte screen,
VGA, HDMI, 1x USB 3, 2x USB 2, SD card reader, wifi, ethernet, replaceable battery.

Software: linux: Fedora 25/27, and Windows 10 home for occasional use.

Use with Windows shows none of the symptoms, so it doesn't seem to be a hardware fault. (Also because it works perfectly with F25 kernel 4.8.6-300.)

The WD hard drive was later replaced with a Samsung 500GB SSD 850 EVO Series 2.5.

Details of hardware components are in the online service manual here:
http://sualaptop365.edu.vn/threads/clevo-w510lu-w515lu-service-manual.1956/

The trackpad seems to be synaptics, recorded in /var/log/messages as
input: psmouse serio2: synaptics: Touchpad model: 1, fw: 7.2, id: 0x1c0b1, caps: 0xd04731/0xa40000/0xa0000/0x0, board id: 0, fw id: 582762

input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio2/input/input10

This is the only mention of input devices in /var/log/messages after suspend+resume. There's no mention of waking up devices.

The *most* annoying feature in all this is the timeout that happens if I don't use the laptop keyboard or trackpad for a time, even if I am logged in to the machine remotely.

While searching for related bugs I found many bug reports related to laptop keyboard/mousepad -- this seems to be a serious weak point in Linux. However, some of the bugs were very different, e.g. reports of laptop keyboard or mouse NOT working after suspend+resume, whereas in my case it's the reverse: they START working.

Comment 1 aaronsloman 2018-03-05 01:00:26 UTC
Just noticed typing errors after Steps to Reproduce:

  3. Use likd to suspend and resume.

should be

  3. Use lid to suspend and resume.

Also I left a spurious "Because the problem"... (From an earlier comment: Because the problem occurs on boot at levels 1, 3 and 5 it cannot be related to the window manager. I meant to delete all of that, because it is stated lower down now.) 

Apologies for any confusion.

Comment 2 aaronsloman 2018-03-05 13:57:39 UTC
The following additional information may be useful in identifying the problem, or problems.

I have found that although the laptop keyboard and trackpad are ignored after a full boot, they work immediately after hibernate/resume.

So it looks as if some crucial items in the hibernate/resume and suspend/resume code should also be in the initial boot code??

Fixing that would still leave the "timeout" problem: input via keyboard or trackpad stops working if neither is used for about 20 minutes.

External (USB) keyboard and mouse work in that case, but using them does not wake up the built in keyboard/trackpad.

I wonder whether 
-- the timeout mechanism disabling keyboard and trackpad should be removed
or
--  after timeout occurs there should be detection of use of keyboard or trackpad, triggering restoration of their full functionality.

I suspect the problem is the latter: i.e. a power saving mechanism turns off detection of keyboard/trackpad, but there is nothing to detect that that should be undone. If they are turned off, some interrupt mechanism should detect the need to turn them on without suspend/resume having to be invoked.

(I don't write programs at that level, so I apologise if my suggestions are irrelevant.)

Meanwhile, is there some way I can turn off the timeout mechanism until there is a complete fix?

Should I split this into two bug reports -- one for non-functionality at boot and one for the timeout problem? (Neither problem occurs in 4.8.6-300.fc25.x86_64)

===
I apologise for the typos in my initial report -- working too late at night, half asleep.

Comment 3 aaronsloman 2018-03-05 16:05:41 UTC
I've tried removing tlp completely. This made no difference to the keyboard timeout. So presumably the problem is deeper in the system.

Comment 4 Peter Hutterer 2018-03-06 05:19:16 UTC
libinput isn't really used on the VT (unless you're using consolation or something), so this is most likely a kernel bug. You can verify this by running sudo evemu-record and selecting the devices that don't work. If you don't see events when you use them, then this is definitely a kernel issue. We'll need a dmesg from you for further debugging, thanks.
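
For anyone without evemu installed, the same check (does the kernel deliver events at all?) can be approximated by reading the raw event node directly. This is only a minimal sketch, not a replacement for evemu-record. It assumes a 64-bit kernel, where each `struct input_event` record is two native longs (timestamp) followed by a 16-bit type, a 16-bit code and a 32-bit value. The helper names `parse_event` and `watch` are made up for illustration, and the device node has to be looked up by hand, e.g. in /proc/bus/input/devices.

```python
import struct

# Assumed layout of struct input_event on a 64-bit Linux kernel:
# tv_sec (long), tv_usec (long), type (u16), code (u16), value (s32).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


def parse_event(raw):
    """Decode one raw input_event record into a dict."""
    sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    return {"time": sec + usec / 1e6, "type": ev_type, "code": code, "value": value}


def watch(device_path):
    """Print every event the kernel delivers for device_path (usually needs root)."""
    with open(device_path, "rb") as dev:
        while True:
            raw = dev.read(EVENT_SIZE)
            if len(raw) < EVENT_SIZE:
                break  # device went away or short read
            print(parse_event(raw))
```

Calling `watch()` with the event node of the keyboard or touchpad and then pressing keys should print records. If nothing is printed while the device is "dead", the events are being lost below userspace, which points at the kernel rather than libinput.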

Comment 5 aaronsloman 2018-03-06 10:51:24 UTC
Created attachment 1404762 [details]
dmesg output running kernel 4.8.6(f25) and kernel 4.15.6(f27)

Two files included in tar file 
   dmesg-4.8.6-and-4.15.6.tgz

FILE1: 
dmesg-stone-4.8.6-300.fc25.x86_64
booting of kernel 4.8.6 to level 3, with keyboard working
after running startx keyboard and touchpad work

FILE2:
dmesg-stone-4.15.6-300.fc27.x86_64-startx-timeout
search for "******" for inserted comments indicating actions

booting of kernel 4.15.6 to level 3, with keyboard not working
shut lid to suspend, paused, then opened: keyboard working

ran startx to enter graphic mode with ctwm window manager.
keyboard and touchpad working.
paused for about 10 minutes, found keyboard and touchpad not working
shut and opened lid as above. Both working again.

Comment 6 aaronsloman 2018-03-06 11:06:54 UTC
Just sent attachment (tar file with output of dmesg from 4.8.6-300.fc25.x86_64
and output of dmesg from 4.15.6-300.fc27.x86_64, the latter with comments indicating where keyboard/touchpad were and were not working). Comments are flagged by asterisks and text in capitals.

Now running evemu-record and waiting for keyboard to stop working. Will send results later.

Please note: I am somewhat out of my depth. If I've provided something different from what's required, please let me know what I should have done.

Something else that may be relevant. I sometimes get messages like this today:

Message from syslogd@stone at Mar  6 09:36:32 ...
 kernel:Uhhuh. NMI received for unknown reason 2c on CPU 0.

Message from syslogd@stone at Mar  6 09:36:32 ...
 kernel:Do you have a strange power saving mode enabled?

Message from syslogd@stone at Mar  6 09:36:32 ...
 kernel:Dazed and confused, but trying to continue

That happened this morning while running 4.15.6-300.fc27 before I rebooted into 4.8.6-300.fc25 to get comparative dmesg. Since I returned to 4.15.6 it has not repeated.

Comment 7 aaronsloman 2018-03-06 20:29:12 UTC
Apologies for delay -- I've been out.

Results of use of evemu-record:
I booted the machine to level 3 then used 'startx' to enter graphic mode, then
ran evemu-record both with
option 4:
   = AT Translated Set 2 keyboard
and option 6:    
   = SynPS/2 Synaptics TouchPad

In both cases the relevant actions (key press, touch pad motion) were reported.

In both cases after the machine was left idle for a time, then tested, the actions produced no response. But suspend + resume (using laptop lid) restored the responses.

This also worked when I ran evemu-record while logged in to the laptop from a linux PC using a wifi connection.

It looks as if something in kernel 4.8.6-300 was lost in later kernels.

Let me know if there's anything else I can do.

Thanks.

Comment 8 aaronsloman 2018-03-07 09:27:42 UTC
This morning, after resume from hibernate:
Message in xterm window:

Message from syslogd@stone at Mar  7 09.00.32
do_IRQ: 3.36 No irq handler for vector

Also in /var/log/messages
Mar  7 09:00:32 stone kernel: do_IRQ: 3.36 No irq handler for vector

There are occurrences about once a day, sometimes twice, in 

   /var/log/messages*
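
The "about once a day" estimate can be checked by counting the matching lines per day. The following is a rough sketch, assuming the traditional syslog line format shown above (month, day, time, host, message). The function name `count_per_day` is invented for this example.

```python
import re
from collections import Counter

# Matches lines such as:
#   Mar  7 09:00:32 stone kernel: do_IRQ: 3.36 No irq handler for vector
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+\d{2}:\d{2}:\d{2}\s.*No irq handler for vector"
)


def count_per_day(lines):
    """Return a Counter mapping (month, day) to the number of matching lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

To use it, open each file matching /var/log/messages* (as root) and pass the file object to `count_per_day`. Rotated logs that have been compressed would need to be decompressed first.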

Comment 9 aaronsloman 2018-03-07 23:43:48 UTC
Yet another (strange?) piece of information. This morning I was using this laptop to give a presentation followed by discussion. It was connected via VGA to a display panel on the wall. I noticed after the discussion period that although I had not touched the laptop for over 30 minutes the keyboard and mousepad were both still working.

In order to check this I later tried at home using an old computer monitor with VGA connector. I left it connected for over an hour, without touching the keyboard or the mousepad. After that, everything worked: no dead keyboard or mousepad.

As a control, I then disconnected the monitor and left the computer untouched for about 50 minutes. After that the keyboard and mousepad did not work, until suspend+resume, using the lid.

How can attaching an external display prevent keyboard and mousepad being disabled?

Comment 10 aaronsloman 2018-03-14 22:12:41 UTC
Have just updated to kernel 4.15.8-300.fc27.x86_64. All the previously mentioned problems remain: I can't log in immediately after boot without doing suspend+resume, and the keyboard and trackpad stop working if neither is used for a while (20-30 minutes), but suspend+resume revives both.

However, I have found a 'hack' that reduces the hassle after reboot.

Normally I don't reboot unless there's a new kernel or a lot of packages have been updated. Then I boot to level 3 and after any bookkeeping run startx. If there's no reason to reboot I use hibernate/resume, which works perfectly.

To save time/hassle after boot I have found a temporary hack:

To avoid having to shut the lid to suspend, then wait for the flashing light, and then open the lid, I altered a startup script to run suspend before the login prompt.

In /etc/rc.d/init.d/livesys-late

after this line: 

   /etc/init.d/functions

I inserted:

   systemctl suspend

So now, the initial boot process does not reach the login prompt. Instead it suspends (with suspend light flashing). I press the start switch and it resumes, gives the login prompt, and I can login normally. This is less hassle than having to use the lid to enable login.

The annoying timeout problem remains, however.

This makes me wonder whether the boot login problem is not in the kernel but in one of the startup scripts -- something in a startup script needs to do what resume from suspend does?

But that would not explain why in Fedora 25 the boot login and idle timeout problems both started on my laptop after kernel 4.8.6-300.fc25.x86_64. All the later F25 kernels had the login problem. Going back to 4.8.6.-300 solved the problem.

Comment 11 aaronsloman 2018-03-21 13:49:44 UTC
Updated to 4.15.10

kernel-modules-extra-4.15.10-300.fc27.x86_64
kernel-headers-4.15.10-300.fc27.x86_64
kernel-modules-4.15.10-300.fc27.x86_64
kernel-4.15.10-300.fc27.x86_64
kernel-devel-4.15.10-300.fc27.x86_64
kernel-core-4.15.10-300.fc27.x86_64

Good news:

So far, after partial tests it looks as if the keyboard/mouse problems have been fixed. I removed the edit to /etc/rc.d/init.d/livesys-late mentioned in the previous comment and so far I have found:

The laptop now boots to level 3 with keyboard and touchpad working, and they also work after hibernate.

I started the ctwm window manager and left an xterm running for an hour without touching keyboard or mousepad. After that both worked, without suspend/resume being required.

So it looks as if the timeout problem has also been fixed. I have not yet had time to test booting to level 5. I'll try that when I have time.

I've just got this old message again, shortly after hibernate+resume:

 Message from syslogd@stone at Mar 21 13:39:25 ...
 kernel:do_IRQ: 3.36 No irq handler for vector

I assume that's totally related to the bug/s reported here.

Many thanks to all involved in fixing the keyboard/mousepad bug(s).
I won't mark this as closed, until I've checked boot to level 5 (direct to graphical login).

Comment 12 aaronsloman 2018-03-21 21:47:14 UTC
Correction: in comment #11

I wrote:

TYPO in last message
>  Message from syslogd@stone at Mar 21 13:39:25 ...
>  kernel:do_IRQ: 3.36 No irq handler for vector
> 
> I assume that's **totally related** to the bug/s reported here.

Sorry: that should have been "totally UNrelated". Written in too much haste.

Anyhow I have now tried booting to level 5 (graphical mode login) and both the keyboard and the mousepad worked. So I no longer need to connect external usb mouse or keyboard to log in in graphic mode.

All the faults I reported on 5th March seem to have been fixed in the latest kernel. Is 16 days a record?

Is there anything more I can do to help track down the cause of this message, which seems to come regularly after resume from hibernate?
 
  Message from syslogd@stone at <date> <time> ...
    kernel:do_IRQ: 3.36 No irq handler for vector

On a hunch, I tried hibernating with wifi off, and I did not get that message after resume, until I explicitly restarted wifi.

Likewise, turning wifi off then on manually, using nmcli, produces that message (sometimes with additions):

Turn off wifi:
    nmcli networking off

Turn it on:
    nmcli networking on

produces:
    Message from syslogd@stone at Mar 21 20:56:18 ...
    kernel:do_IRQ: 3.36 No irq handler for vector

Despite the message, the command works.
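
The off/on reproduction can be scripted so that only kernel messages which appear during the toggle are reported. This is a sketch under assumptions: it uses the two nmcli commands quoted above and plain `dmesg`, the helper names are invented, and the 5-second wait is an arbitrary guess at how long the message takes to show up.

```python
import subprocess
import time

MESSAGE = "No irq handler for vector"


def new_matches(before, after, needle=MESSAGE):
    """Return lines containing needle that are in `after` but not in `before`."""
    seen = set(before.splitlines())
    return [line for line in after.splitlines() if needle in line and line not in seen]


def toggle_and_check(wait_seconds=5):
    """Toggle networking via nmcli and return kernel messages that appeared meanwhile.

    May need root, depending on whether unprivileged users can read dmesg.
    """
    before = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    subprocess.run(["nmcli", "networking", "off"], check=True)
    subprocess.run(["nmcli", "networking", "on"], check=True)
    time.sleep(wait_seconds)  # assumed delay; adjust if the message arrives later
    after = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    return new_matches(before, after)
```

Comparing whole lines works because each dmesg line carries its own timestamp, so a repeat of the same message shows up as a new line.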

Let me know if there's anything else I can do about this.

Comment 13 aaronsloman 2018-03-30 09:17:44 UTC
The main problem has been fixed. The "No irq handler" problem is unrelated, and I'll start a new bug report for that (I've now found it on both desktop and laptop machines running F27).

Comment 14 aaronsloman 2018-04-01 17:28:30 UTC
The "No irq handler" problem is now reported in bug #1562360 i.e.
https://bugzilla.redhat.com/show_bug.cgi?id=1562360

The bug report describes different manifestations in a notebook with wireless connection and a desktop PC with ethernet connection, both running kernel 4.15.13-300.fc27.x86_64.

I've included dmesg output with relevant bits marked.

Comment 15 Justin M. Forbes 2018-07-23 15:05:51 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.

Fedora 27 has now been rebased to 4.17.7-100.fc27.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 28, and are still experiencing this issue, please change the version to Fedora 28.

If you experience different issues, please open a new bug report for those.

Comment 16 Justin M. Forbes 2018-08-29 15:02:20 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 5 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

