Bug 2264560 - 6.7 kernels crash on Raspberry Pi 4B a minute after boot
Summary: 6.7 kernels crash on Raspberry Pi 4B a minute after boot
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 39
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-02-16 16:01 UTC by Chris Adams
Modified: 2024-11-04 01:30 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-11-04 01:30:25 UTC
Type: ---
Embargoed:


Attachments
kernel 6.8.0.rc4 boot log (46.88 KB, text/plain)
2024-02-16 16:02 UTC, Chris Adams

Description Chris Adams 2024-02-16 16:01:22 UTC
1. Please describe the problem:
After installing Fedora kernels 6.7.3 or 6.7.4 on a Raspberry Pi 4B and rebooting, the system boots but then hangs about a minute after boot. I tried disabling all my local services (chrony+gpsd for GPS time, NUT for UPS, weewx for weather, watchdog) and rebooting but that had no effect. I also enabled network console, but got nothing. I hooked up a monitor (this Pi is normally headless) and saw it was back at a U-Boot> prompt:

Net:   Abort
eth0: ethernet@7d580000
PCIe BRCM: link up, 5.0 Gbps x1 (SSC)
starting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... 6 USB Device(s) found
       scanning usb for storage devices... 1 Storage Device(s) found
U-Boot>


2. What is the Version-Release number of the kernel:
kernel-6.7.4-200.fc39.aarch64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
kernel-6.7.3-200.fc39.aarch64

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Boot a 6.7.x kernel, wait a minute

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
Yes, tried kernel-6.8.0-0.rc4.20240215git8d3dea210042.38.fc41.aarch64

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
Looks like the crash leaves the journal corrupted, but will attach the netconsole output from my syslog server.

Reproducible: Always

Comment 1 Chris Adams 2024-02-16 16:02:14 UTC
Created attachment 2017228
kernel 6.8.0.rc4 boot log

Comment 2 Chris Adams 2024-04-17 23:27:23 UTC
Still happening with 6.8.6-200.fc39... system reboots, comes up for maybe 10-20 seconds, then dies.

This Pi has a GPS HAT with PPS, with the pps-gpio overlay in config.txt (which in turn gets the pps_gpio kernel module loaded).  The length of time "up" after boot may be about the amount of time the GPS takes to reset and start sending data on serial and PPS on the GPIO pin.
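
For anyone checking a similar setup, here is a minimal sketch of how to tell whether the overlay is actually enabled. The helper name `has_pps_overlay`, the `/boot/efi/config.txt` path, and the `gpiopin=18` value in the comment are assumptions for illustration, not details taken from this system.

```shell
# Hypothetical helper: succeeds if FILE contains an uncommented
# "dtoverlay=pps-gpio" line, for example:
#   dtoverlay=pps-gpio,gpiopin=18    (the pin number is an assumption)
has_pps_overlay() {
    # A leading '#' disables the line, so only match uncommented entries.
    grep -Eq '^[[:space:]]*dtoverlay=pps-gpio([,[:space:]]|$)' "$1"
}

# Example use (the path is an assumption; adjust to where config.txt lives):
#   has_pps_overlay /boot/efi/config.txt && echo "pps-gpio overlay enabled"
#   lsmod | grep pps_gpio    # shows whether the module got loaded
```

Commenting the line out with `#` and rebooting is the quickest way to rule the overlay in or out.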

Comment 3 Peter Robinson 2024-04-29 12:50:45 UTC
> also enabled network console, but got nothing. I hooked up a monitor (this
> Pi is normally headless) and saw it was back at a U-Boot> prompt:

What kernel output do you get when the kernel starts to boot? How far does the kernel get before it resets? What version of U-Boot?

> 2. What is the Version-Release number of the kernel:
> kernel-6.7.4-200.fc39.aarch64

I've not seen issues with RPi4 with any 6.7/6.8 kernels.

> Looks like the crash leaves the journal corrupted, but will attach the
> netconsole output from my syslog server.

It doesn't show anything after the journald line. What happens at that time?

One thought: in our experience supporting the various RPi devices, these sorts of problems are often PSU related. What sort of PSU do you have? The U-Boot output mentions 6 USB devices. What are they, and how are they connected (for example, through a powered hub)?

Comment 4 Chris Adams 2024-04-30 00:56:46 UTC
It will boot up just long enough for me to SSH in before it dies. I hooked a monitor back up and there are no error messages or anything... it boots to the login prompt, and then after a few seconds, it dumps back to U-Boot. There's nothing else in netconsole or local dmesg (when I can SSH in and run it before it crashes).

U-Boot is uboot-images-armv8-2023.07-3.fc39.noarch. I tried 6.9.0-0.rc5.20240426gitc942a0cd3603.48.fc41.aarch64 from rawhide and got the same result.

It's the official Raspberry Pi 4 USB-C power supply. I wouldn't think that it'd work 100% reliable with kernel 6.6 but fail 100% of the time with a newer kernel.

The USB devices:

- a weather station (with its own power supply)
- a UPS (for monitoring)
- a Logitech unifying receiver (that I forgot I left plugged in, not used)
- an SSD (used for the root FS)

Comment 5 Peter Robinson 2024-04-30 08:50:16 UTC
> It's the official Raspberry Pi 4 USB-C power supply. I wouldn't think that
> it'd work 100% reliable with kernel 6.6 but fail 100% of the time with a
> newer kernel.

You wouldn't think so, but we have seen exactly this in the past with RPi. They're non-standard USB power supplies (even the USB-C on the RPi5 doesn't use USB-PD but its own invention), and we've seen issues in the past with major kernel bumps that have enabled certain things (like the 3D GPU), where power draw goes up and some random PSU suddenly isn't good enough. Hence the question.

The official one should be fine, but a sudden reset on boot like this is likely something local, because otherwise we would have seen it in the rest of the 6.8 testing for F-40 GA. Can you try booting with nothing but the SSD attached?

Comment 6 Peter Robinson 2024-04-30 08:51:30 UTC
> U-Boot is uboot-images-armv8-2023.07-3.fc39.noarch.

What does "sudo dmesg | grep DMI" report?

Comment 7 Chris Adams 2024-04-30 14:15:09 UTC
I have another official Pi 4 power supply (that's barely been used), I'll swap it out to see if that makes a difference.

I did try commenting out the PPS overlay in config.txt to see if that made a difference; it did not.

Also, the use case I had for the SSD didn't end up happening, so I'll try to switch it back to a uSD card (but I think I'll have to pick up a new card for that).

DMI:
[    0.034908] DMI: raspberrypi,4-model-b Raspberry Pi 4 Model B Rev 1.5/Raspberry Pi 4 Model B Rev 1.5, BIOS 2023.07 07/01/2023
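
The `BIOS` field in that line appears to carry the U-Boot release that actually booted (2023.07 here, in line with the installed package). A minimal sketch for pulling it out of dmesg, assuming the line keeps the `BIOS <version> <date>` shape shown above; the function name `uboot_version_from_dmi` is made up for illustration.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the
# firmware version that follows "BIOS" on the DMI line.
uboot_version_from_dmi() {
    sed -n 's/.*DMI:.*BIOS \([^ ]*\) .*/\1/p'
}

# Example use:
#   sudo dmesg | uboot_version_from_dmi
#   rpm -q uboot-images-armv8    # installed package, for comparison
```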

Comment 8 Chris Adams 2024-05-01 00:10:48 UTC
Okay, moved from the external SSD back to just a uSD card. Disconnected all USB. Used the other Pi 4 power supply. Made sure all my services (so everything except for systemd stuff and NetworkManager) were disabled. I also put the default config.txt in place. Still crashed. The only thing I did not remove was the GPS HAT (because it is kind of difficult to get in and out).

One thing - with a monitor on it, I could see that U-Boot actually rebooted it. I used grub2-reboot to set the 6.9.0-rc5 kernel just for one boot, so it successfully booted back to 6.6.13.
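
For reference, a rough sketch of the one-boot approach: `grub2-reboot` selects an entry for the next boot only, so after the crash and reset the default kernel comes up again. The helper `entry_index_for` is hypothetical, and it assumes that `grubby --info=ALL` prints `index=` and `kernel=` lines per entry and that the index matches the GRUB menu order.

```shell
# Hypothetical helper: read `grubby --info=ALL` style output on stdin and
# print the index of the first entry whose kernel path contains VERSION.
entry_index_for() {
    awk -v ver="$1" '
        /^index=/                    { idx = substr($0, 7) }  # text after "index="
        /^kernel=/ && index($0, ver) { print idx; exit }
    '
}

# Example use (the version string is illustrative):
#   idx=$(sudo grubby --info=ALL | entry_index_for 6.9.0-0.rc5)
#   sudo grub2-reboot "$idx"    # applies to the next boot only
#   sudo reboot
```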

Comment 9 Chris Adams 2024-11-04 01:30:25 UTC
System is working on a card with a clean F41 install... no idea what the problem was.

