Bug 2040183 - Kernel 5.14 and 5.15 unable to boot on AWS EC2 i3.large instance type
Summary: Kernel 5.14 and 5.15 unable to boot on AWS EC2 i3.large instance type
Keywords:
Status: CLOSED DUPLICATE of bug 2010058
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 34
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-13 08:24 UTC by Orange Kao
Modified: 2022-02-03 02:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-02-03 02:40:51 UTC
Type: Bug
Embargoed:


Attachments
dmesg from kernel version 5.15.13-100 (50.75 KB, text/plain)
2022-01-13 08:24 UTC, Orange Kao

Description Orange Kao 2022-01-13 08:24:23 UTC
Created attachment 1850518
dmesg from kernel version 5.15.13-100

1. Please describe the problem:

The latest kernel, 5.15.13-100.fc34, fails to boot on the Amazon EC2 i3.large instance type. The console shows one of the following:

  nvme nvme0: pci function 000:00:1e.0
  nvme nvme0: 15/0/0 default/read/poll queues
  Generating "/run/initramfs/rdsosreport.txt"

  OR

  nvme nvme0: pci function 000:00:1e.0
  nvme nvme0: I/O 16 QID 0 timeout, completion polled
  nvme nvme0: I/O 17 QID 0 timeout, completion polled
  nvme nvme0: I/O 18 QID 0 timeout, completion polled
  nvme nvme0: I/O 19 QID 0 timeout, completion polled
  nvme nvme0: I/O 12 QID 0 timeout, completion polled
  nvme nvme0: I/O 16 QID 0 timeout, completion polled
  nvme nvme0: I/O 13 QID 0 timeout, completion polled
  nvme nvme0: I/O 14 QID 0 timeout, completion polled
  nvme nvme0: I/O 15 QID 0 timeout, completion polled
  nvme nvme0: I/O 12 QID 0 timeout, completion polled
  nvme nvme0: I/O 13 QID 0 timeout, completion polled
  nvme nvme0: I/O 14 QID 0 timeout, completion polled
  nvme nvme0: I/O 17 QID 0 timeout, completion polled
  (similar messages repeat indefinitely; in rare cases the system eventually boots to a login prompt)
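
For triaging a saved kernel log, the timeout pattern above can be counted with a small helper. This is only a sketch: the function name count_nvme_timeouts is made up for illustration, and the pattern is taken from the messages quoted in this report, so other kernel versions may word them differently.

```sh
#!/bin/sh
# Hypothetical helper (not part of any Fedora tool): count the
# "timeout, completion polled" NVMe lines in a kernel log read from stdin.
# The pattern mirrors the messages quoted in this report.
count_nvme_timeouts() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at success.
    grep -c 'nvme nvme[0-9]*: I/O [0-9]* QID [0-9]* timeout, completion polled' || true
}
```

For example, `journalctl --no-hostname -k | count_nvme_timeouts` prints the count for the current boot. A non-zero count only indicates that the symptom is present, not what caused it.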

2. What is the Version-Release number of the kernel:

5.15.13-100

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Kernel 5.13.19-200.fc34 is not affected by this issue.

The issue first appeared in 5.14.9-200.fc34.
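
To check whether a given instance is running a kernel at or after the first release reported as bad, a version comparison along these lines may help. It is a sketch under assumptions: the function names are hypothetical, it relies on `sort -V` (GNU coreutils), and it treats every release from 5.14.9 onward as possibly affected, which this report does not establish for every build.

```sh
#!/bin/sh
# Return success if version $1 is greater than or equal to version $2.
# Uses "sort -V" (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check: return success if the kernel release given as $1
# (for example the output of "uname -r") is at or after 5.14.9, the first
# release reported as bad here. Assumption: no upper bound is known.
maybe_affected() {
    # Strip everything from the first "-" so only the upstream
    # version (e.g. 5.15.13) is compared.
    version_ge "${1%%-*}" "5.14.9"
}
```

Usage: `maybe_affected "$(uname -r)" && echo "kernel may be affected"`.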


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

  1. Start AWS EC2 instance type i3.large
     using AMI ami-0627bcdb0bea81d1b (Sydney)
       (Fedora-Cloud-Base-34-1.2.x86_64-hvm-ap-southeast-2-gp2-0)
     or AMI ami-09e08e82e8f927ba4 (N. Virginia)
       (Fedora-Cloud-Base-34-1.2.x86_64-hvm-us-east-1-gp2-0)
  2. sudo dnf -y install kernel kernel-core kernel-headers kernel-modules
  3. sudo shutdown -r now


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes, this problem also occurs on the latest Rawhide kernel (kernel-core 5.16.0-60.fc36).

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

The file "i3-large-new-kernel" is attached. It contains nvme- and ena-related
error messages.


Note: This seems unrelated to bug #2010058, because I tried modifying
/usr/lib/dracut/modules.d/90kernel-modules/module-setup.sh and running
"dracut", and it did not help.

Comment 1 Orange Kao 2022-02-03 02:40:51 UTC
Sorry, the "dracut" config patch does work. I made a mistake when I first tried the dracut workaround.
This is a duplicate of bug #2010058.

Detail: I ran git bisect and found that commit
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3b62c140e93d32c825ed028faca45dee58dbe37f
introduced the issue for the EC2 i3 instance family.

*** This bug has been marked as a duplicate of bug 2010058 ***

