Bug 2229708 - Bootup has half an hour delay on Kernel 6.4.x on Tuxedo InfinityBook S15 Gen7
Summary: Bootup has half an hour delay on Kernel 6.4.x on Tuxedo InfinityBook S15 Gen7
Status: NEW
Product: Fedora
Classification: Fedora
Component: kernel
Version: 38
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2023-08-07 11:26 UTC by xspielinbox+redhat
Modified: 2023-08-13 18:06 UTC

Attachments
Kernel Log (105.27 KB, text/plain)
2023-08-07 11:28 UTC, xspielinbox+redhat

Description xspielinbox+redhat 2023-08-07 11:26:20 UTC
1. Please describe the problem:
Whenever I boot a 6.4.x kernel on my TUXEDO InfinityBook S 15/17 Gen7 (Firmware Version: 1.07.09RTR2, Memory: 64.0 GiB, Processor: 12th Gen Intel® Core™ i7-1260P × 16, Graphics: Intel® Graphics (ADL GT2), Disk Capacity: 4.0 TB), the laptop does not respond to anything except a hard reset for about 20 to 30 minutes after the GRUB menu. After that time, the password prompt for the disk encryption suddenly shows up.
During that time the screen is sometimes completely black, sometimes shows an underscore in the upper left corner, and often just shows the manufacturer logo.
Once the PC has finally booted, no further problems are noticeable.

2. What is the Version-Release number of the kernel:
6.4.4-200.fc38.x86_64
6.4.6-200.fc38.x86_64
6.4.7-200.fc38.x86_64
At least all of these are affected.

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?
When switching back to kernel version 6.3.12, everything works fine and as expected, as it did in all the months before. So kernel 6.4.4 was the first affected version that I could test.
Note: Kernel 6.4.0 to 6.4.3 were never offered to me for install via dnf.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
- install Kernel 6.4.X
- reboot and do not switch to older kernel in Grub

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
Yes, it occurs with 6.5.0-0.rc4.20230804gitc1a515d3c027.33.fc39.x86_64 too.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
None that I am aware of.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
Done

Reproducible: Always

Comment 1 xspielinbox+redhat 2023-08-07 11:28:04 UTC
Created attachment 1982109
Kernel Log

Comment 2 xspielinbox+redhat 2023-08-12 05:44:11 UTC
The problem persists with Kernel 6.4.8-200.fc38.x86_64 and 6.4.9-200.fc38.x86_64.

Comment 3 Hans de Goede 2023-08-12 09:42:33 UTC
Have you tried contacting Tuxedo about this?

Running Linux is a supported use case on their laptops and Fedora's kernel is pretty close to the upstream kernel, so I would expect them to be able to reproduce this.

In my experience Tuxedo is usually pretty good at resolving Linux support issues for their laptops.

Comment 4 Hans de Goede 2023-08-12 21:25:41 UTC
I just noticed that this commit, which specifically fixes a problem on the TUXEDO InfinityBook S 15/17 Gen7, has been queued for 6.4.11 (which is not yet released):

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-6.4.y&id=5e45622210dd8322c7ebafc44da655f6c096beb4

So I suspect that this bug will be fixed by 6.4.11 once released and picked up by Fedora in a couple of days.

Comment 5 xspielinbox+redhat 2023-08-12 22:06:31 UTC
Thank you for the reply!

Yes, this commit does seem like something that would fix my problem, though I haven't tried it yet.

I have contacted Tuxedo in the meantime, but they will probably just tell me the same thing.

But I am still confused about how this regression was introduced in the first place. I can't find any change regarding IRQ in the last month in https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/drivers/char/tpm/tpm_tis.c?h=linux-6.4.y

I unfortunately don't understand the kernel code, but I assume that https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/drivers/char/tpm/tpm_tis.c?h=linux-6.4.y&id=77218e83c83c1cd4b994edfb5b162ece42e73ffe must have somehow introduced the regression?

During my initial testing and research I tried some kernel command-line options to see whether they resolve my issue. Interestingly, as far as I can remember, tpm_tis.interrupts=0 did not solve it, though it is presented as an effective workaround in the referenced SUSE bug.

What are the advantages/disadvantages of IRQ and polling anyway? Wouldn't it be a better solution if the firmware bug that everyone in the threads is talking about gets solved, instead of the kernel disabling some feature?

Comment 6 xspielinbox+redhat 2023-08-13 09:33:14 UTC
Whatever I did last time, I tried it again, and adding the tpm_tis.interrupts=0 kernel boot option does indeed work around the issue, at least with kernel 6.4.9.

I also noticed that reboots do not seem to be affected, at least when rebooting from kernel 6.4.9 into kernel 6.4.9.
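
To confirm that the tpm_tis.interrupts=0 option is really active on the running kernel, one can look for it in /proc/cmdline. The following is only a minimal Python sketch of such a check. The helper names kernel_param and running_kernel_param are hypothetical, and the parsing is simplified: quoted values containing spaces and the kernel's "-"/"_" equivalence in parameter names are not handled.

```python
from pathlib import Path


def kernel_param(cmdline: str, name: str):
    """Return the value of `name` in a kernel command line string.

    Returns None if the parameter is absent and "" for a bare flag
    without "=". If the parameter appears more than once, the last
    occurrence is returned (a simplifying assumption of this sketch).
    """
    value = None
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name:
            value = val
    return value


def running_kernel_param(name: str, path: str = "/proc/cmdline"):
    """Look up `name` in the command line of the currently running kernel."""
    return kernel_param(Path(path).read_text(), name)


# Example with a made-up command line, not one taken from this report:
example = "BOOT_IMAGE=/vmlinuz ro quiet tpm_tis.interrupts=0"
print(kernel_param(example, "tpm_tis.interrupts"))
```

On the affected machine, running_kernel_param("tpm_tis.interrupts") returning "0" would indicate that the workaround was passed to the kernel. On Fedora, kernel arguments are commonly made persistent with grubby, for example ``sudo grubby --update-kernel=ALL --args="tpm_tis.interrupts=0"``; that command is given as a general pointer and is not taken from this report.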

Comment 7 Hans de Goede 2023-08-13 12:29:30 UTC
> I have contacted Tuxedo in the meantime, but they will probably just tell me the same thing.

I'm not sure if they are aware of this fix, since it was sent by someone from SUSE, not by someone from Tuxedo. So it would be good if you could send them another mail pointing to the fix so that they won't waste time on it.

> I unfortunately don't understand the kernel code, but I assume that https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/drivers/char/tpm/tpm_tis.c?h=linux-6.4.y&id=77218e83c83c1cd4b994edfb5b162ece42e73ffe must have somehow introduced the regression?

No, it is this commit, which is in 6.4.y and not in 6.3.y, that causes this:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/char/tpm?h=v6.4&id=e644b2f498d297a928efcb7ff6f900c27f8b788e

> What are the advantages/disadvantages of IRQ and polling anyway?

Polling means asking the TPM repeatedly whether the last command has finished, which consumes CPU. Using an IRQ means waiting for the TPM to signal the CPU that things are ready, allowing the CPU to do other work in the meantime.

So using an IRQ is usually preferred over polling. But typically the TPM is only used a couple of times during boot and it does not see much use after that. So using polling instead of IRQ-based access is not a big deal.
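
The difference can be illustrated with a small user-space analogy in Python. This is only a sketch of the two waiting strategies and not how tpm_tis is implemented. FakeDevice, wait_by_polling and wait_by_notification are hypothetical names, and a threading.Event stands in for the interrupt.

```python
import threading
import time


class FakeDevice:
    """Hypothetical device that finishes a command after `delay` seconds."""

    def __init__(self, delay: float):
        self.done = threading.Event()
        # A timer thread plays the role of the hardware completing its work.
        threading.Timer(delay, self.done.set).start()


def wait_by_polling(dev: FakeDevice, interval: float = 0.01) -> int:
    """Ask repeatedly whether the command has finished.

    Returns the number of status checks that were needed; every check
    is work the caller has to do itself.
    """
    checks = 0
    while True:
        checks += 1
        if dev.done.is_set():
            return checks
        time.sleep(interval)


def wait_by_notification(dev: FakeDevice, timeout: float = 5.0) -> bool:
    """Sleep until the device signals completion (the IRQ-like approach).

    Returns True if the signal arrived before the timeout expired.
    """
    return dev.done.wait(timeout)


polls = wait_by_polling(FakeDevice(0.1))
signalled = wait_by_notification(FakeDevice(0.1))
print(f"polling needed {polls} checks; notification arrived: {signalled}")
```

The polling variant needs many status checks for a single command, while the notification variant makes one blocking call and is woken up when the device is ready. For a device that is only used a handful of times, the extra checks hardly matter.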

And because of the ever-growing list of devices that have issues with the IRQ, tpm_tis is actually going to move to polling by default:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/drivers/char/tpm/tpm_tis.c?h=linux-6.4.y&id=44d3baca8bcda0856fc20fb58ae4bd254d952580

Comment 8 xspielinbox+redhat 2023-08-13 18:06:30 UTC
Thank you for the good explanation!

I notified Tuxedo.

