Bug 1921924 - Raspberry Pi 3B+ boots incredibly slow after CPU failed to come online errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: uboot-tools
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Robinson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker AcceptedFreezeExcepti...
Depends On:
Blocks: ARMTracker F34BetaFreezeException
Reported: 2021-01-28 21:17 UTC by Brandon Nielsen
Modified: 2021-03-23 21:12 UTC

Fixed In Version: uboot-tools-2021.04-0.5.rc3.fc34
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-19 20:15:31 UTC
Type: Bug
Embargoed:


Attachments
journalctl output after first boot (408.85 KB, text/plain)
2021-02-03 20:35 UTC, Brandon Nielsen

Description Brandon Nielsen 2021-01-28 21:17:38 UTC
Description of problem: System boot is incredibly slow (continuing for well over an hour), with many services failing to start. At the start of boot, "CPU1: failed to come online", "CPU2: failed to come online", and "CPU3: failed to come online" appear on the console.


Steps to Reproduce:

1. Write Fedora Workstation Rawhide image to SD card using "sudo arm-image-installer --image=/home/nielsenb/Desktop/Fedora-Workstation-Rawhide-20210128.n.0.aarch64.raw.xz --target=rpi3 --media=/dev/sdb --resizefs"
2. Insert SD card into Raspberry Pi 3B+
3. Watch boot process


Actual results:

Boot takes over an hour. "CPU1: failed to come online", "CPU2: failed to come online", and "CPU3: failed to come online" are displayed very early in the boot process. The only other log messages appear to show services failing to start. Eventually the system boots, but it is unusably slow.


Expected results:

System boots to the desktop in a reasonable amount of time, and is usable.


Additional info:

Using the 20210128 compose[0]. Assigned to kernel because I don't know where else to put it; please reassign appropriately if this is wrong. I have found a possibly related upstream issue that points to firmware[1]. If I manage to capture logs, I will attach them here.

[0] - https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20210128.n.0/compose/Workstation/aarch64/images/Fedora-Workstation-Rawhide-20210128.n.0.aarch64.raw.xz
[1] - https://github.com/Hexxeh/rpi-firmware/issues/232

Comment 1 Brandon Nielsen 2021-02-03 20:34:37 UTC
20210203 compose[0] boots successfully without any "service failed to start" errors. The "CPU failed to come online" errors persist, and only one processor appears after boot. I have attached the output of 'journalctl -b' after the initial boot.

[0] - https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20210203.n.0/compose/Workstation/aarch64/images/Fedora-Workstation-Rawhide-20210203.n.0.aarch64.raw.xz
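
One way to confirm the "only one processor appears" symptom without reading the full journal is to look at the kernel's list of online CPUs. The following is a minimal sketch, assuming a Linux system that exposes /sys/devices/system/cpu/online; the function names are illustrative and not part of any Fedora tool.

```python
from pathlib import Path
from typing import List


def parse_cpu_list(text: str) -> List[int]:
    """Expand a kernel cpulist string such as "0-3" or "0,2-3" into CPU ids."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_cpus(path: str = "/sys/devices/system/cpu/online") -> List[int]:
    """Read and expand the online-CPU list (assumes the sysfs file exists)."""
    return parse_cpu_list(Path(path).read_text())
```

Calling online_cpus() on a system showing this bug would be expected to return a single CPU id, while a Pi 3B+ with all cores up would be expected to return four.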

Comment 2 Brandon Nielsen 2021-02-03 20:35:18 UTC
Created attachment 1754875
journalctl output after first boot

Comment 4 Tom Lane 2021-02-05 21:40:41 UTC
I'm seeing approximately the same symptoms on a Raspberry Pi 3B+: very slow boot, and CPUs 1 through 3 fail to come online.  Eventually (after 5 minutes or so) it does get to the textual system setup dialog (locale/network/passwords/etc), but I've not been able to get through that because keyboard input is too flaky.

This is with the Fedora-Server-Rawhide-20210205.n.0.aarch64.raw.xz install image.

A week or so ago I tried Fedora-Server-Rawhide-20210128.n.0.aarch64.raw.xz
and then Fedora-Minimal-Rawhide-20210127.n.1.aarch64.raw.xz with no better results.

Current Fedora 33 is fine on the same hardware.

Comment 6 Brandon Nielsen 2021-02-23 00:55:14 UTC
I can confirm I cannot reproduce with Fedora 33, even with the latest kernel.

Comment 7 Brandon Nielsen 2021-02-24 21:10:34 UTC
I am also not able to reproduce by using dnf system-upgrade to upgrade to Fedora 34. The upgraded system works as expected and does not exhibit this bug.

It must be something with how the compose is generated, so kernel is almost certainly not the correct component.

Comment 8 Peter Robinson 2021-03-02 16:16:16 UTC
(In reply to Brandon Nielsen from comment #3)
> armhfp 20210203 compose[0] seems to hang at "starting kernel".

That was a different problem and should be now resolved.

Comment 9 Brandon Nielsen 2021-03-04 00:32:41 UTC
(In reply to Peter Robinson from comment #8)
> (In reply to Brandon Nielsen from comment #3)
> > armhfp 20210203 compose[0] seems to hang at "starting kernel".
> 
> That was a different problem and should be now resolved.

I can confirm the 20210302.n.1 armhfp compose[0] boots successfully.

The same compose for aarch64[1] shows the CPU failed to come online error described in the bug.

[0] - https://kojipkgs.fedoraproject.org/compose/branched/Fedora-34-20210302.n.1/compose/Workstation/armhfp/images/Fedora-Workstation-34-20210302.n.1.armhfp.raw.xz
[1] - https://kojipkgs.fedoraproject.org/compose/branched/Fedora-34-20210302.n.1/compose/Workstation/aarch64/images/Fedora-Workstation-34-20210302.n.1.aarch64.raw.xz

Comment 10 Tom Lane 2021-03-06 22:15:08 UTC
I can confirm that the upgrade-F33-to-F34 path does not show this problem (it has other ones though :-().  In testing that, I have had successful boots with kernel versions 5.11.0-156.fc34.aarch64 and 5.11.3-300.fc34.aarch64.  Meanwhile, the last composed image I tried, Fedora-Server-34-20210303.n.0.aarch64.raw.xz, still had the issue.

Hence, I agree with comment #7 that this is not a kernel bug per se.  It does not seem to be a grub2 problem either (I had a successful boot with 1:2.04-38.fc34), which leaves me at a loss to guess which component to blame next.

Comment 11 Brandon Nielsen 2021-03-11 01:42:04 UTC
Observed with both Workstation Beta 1.1[0] and IoT 20210308[1] composes. 

The workstation compose never successfully booted to initial setup.

The IoT compose did not exhibit the slow boot behavior, implying that may be a separate issue, but still only one CPU came online.

[0] - https://kojipkgs.fedoraproject.org/compose/34/Fedora-34-20210310.0/compose/Workstation/aarch64/images/Fedora-Workstation-34_Beta-1.1.aarch64.raw.xz
[1] - https://kojipkgs.fedoraproject.org/compose/iot/Fedora-IoT-34-20210308.0/compose/IoT/aarch64/images/Fedora-IoT-34-20210308.0.aarch64.raw.xz

Comment 12 nzfrne+ej4q0pzgt6m54 2021-03-11 11:35:11 UTC
The issue is with U-Boot.
I'm using a Raspberry Pi 3B+ with Fedora Server Rawhide. After noticing this and trying different kernels, I went back to U-Boot 2020.10, and now, instead of dmesg reporting cores failing to come online, I get:

[    0.009482] smp: Bringing up secondary CPUs ...
[    0.010620] Detected VIPT I-cache on CPU1
[    0.010721] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    0.011964] Detected VIPT I-cache on CPU2
[    0.012036] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    0.013265] Detected VIPT I-cache on CPU3
[    0.013337] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[    0.013587] smp: Brought up 1 node, 4 CPUs
[    0.013608] SMP: Total of 4 processors activated.
[    0.013618] CPU features: detected: 32-bit EL0 Support
[    0.013629] CPU features: detected: CRC32 instructions
[    0.013639] CPU features: detected: 32-bit EL1 Support
[    0.013836] CPU features: emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching
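
To compare a good boot like the one above with a bad one, the two kernel messages quoted in this bug can be matched mechanically. This is a rough sketch that relies only on the message texts shown in this report ("failed to come online" and "Booted secondary processor"); the function name and the sample strings are illustrative.

```python
import re

# Patterns taken from the kernel messages quoted in this bug report.
FAILED = re.compile(r"CPU(\d+): failed to come online")
BOOTED = re.compile(r"CPU(\d+): Booted secondary processor")


def classify_secondary_cpus(log_text: str) -> dict:
    """Return which secondary CPUs booted and which failed, per the log text."""
    booted = sorted({int(m.group(1)) for m in BOOTED.finditer(log_text)})
    failed = sorted({int(m.group(1)) for m in FAILED.finditer(log_text)})
    return {"booted": booted, "failed": failed}


# Sample input: two lines copied from the log above plus one failure message.
sample = (
    "[    0.010721] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]\n"
    "[    0.012036] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]\n"
    "CPU3: failed to come online\n"
)
print(classify_secondary_cpus(sample))
```

The text to classify could come from saved 'journalctl -b' or dmesg output.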

Comment 13 Peter Robinson 2021-03-11 13:20:50 UTC
(In reply to nzfrne+ej4q0pzgt6m54 from comment #12)
> the issue is with uboot

Good catch! A quick test shows it's a regression somewhere between 2020.10 and 2021.01.

Comment 14 Peter Robinson 2021-03-12 11:15:03 UTC
Bisected it down to upstream commit 4cbb2930bd8c (efi_loader: consider no-map property of reserved memory); I've reached out to upstream.
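
For readers unfamiliar with bisection, the idea is a binary search over the commit history between a known-good and a known-bad build. The sketch below illustrates the general technique only; it is not a record of how the bisect above was actually run. The commit list is made up apart from the hash named in this comment, and is_bad stands in for "build U-Boot at this commit, boot the Pi, and check whether CPUs fail to come online".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history: "a", "b", "c" are placeholders, not real commits.
history = ["v2020.10", "a", "b", "4cbb2930bd8c", "c", "v2021.01"]
bad_commits = set(history[3:])
print(first_bad(history, lambda commit: commit in bad_commits))
```

Each step halves the remaining range, so the number of test boots grows only with the logarithm of the number of commits.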

Comment 15 Peter Robinson 2021-03-14 19:29:52 UTC
This was posted upstream and also appears to fix the problem:
https://lists.denx.de/pipermail/u-boot/2021-March/444343.html

Comment 16 Fedora Update System 2021-03-14 22:12:16 UTC
FEDORA-2021-84510c6f65 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-84510c6f65

Comment 17 Fedora Blocker Bugs Application 2021-03-14 22:13:57 UTC
Proposed as a Blocker for 34-beta by Fedora user pbrobinson using the blocker tracking app because:

 Fixes a regression where the Raspberry Pi 3 series of devices only boots with a single processor; with this fix it boots with all 4 cores.

Comment 18 Geoffrey Marr 2021-03-15 20:52:35 UTC
Discussed during the 2021-03-15 blocker review meeting: [0]

The decision to classify this bug as a "RejectedBlocker (Beta)" and an "AcceptedFreezeException (Beta)" was made because the system does boot eventually, so this doesn't seem to violate the criteria, and there's the precedent of the g-i-s issue, which we didn't block on either. It was accepted as an FE because it's obviously highly inconvenient and can't be fixed entirely with an update.

[0] https://meetbot.fedoraproject.org/fedora-blocker-review/2021-03-15/f34-blocker-review.2021-03-15-16.00.txt

Comment 19 Fedora Update System 2021-03-16 14:42:22 UTC
FEDORA-2021-84510c6f65 has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-84510c6f65`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-84510c6f65

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 20 Brandon Nielsen 2021-03-18 19:23:25 UTC
As mentioned on Bodhi, I'm not seeing this as fixed on my Pi 3B+ with FEDORA-2021-84510c6f65, is there something more I have to do than "sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-84510c6f65"?

Comment 21 Paul Whalen 2021-03-18 19:39:38 UTC
(In reply to Brandon Nielsen from comment #20)
> As mentioned on Bodhi, I'm not seeing this as fixed on my Pi 3B+ with
> FEDORA-2021-84510c6f65, is there something more I have to do than "sudo dnf
> upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-84510c6f65"?

After installing run 'rpi-uboot-update' to copy it into place.

Comment 22 Brandon Nielsen 2021-03-18 19:44:46 UTC
(In reply to Paul Whalen from comment #21)
> (In reply to Brandon Nielsen from comment #20)
> > As mentioned on Bodhi, I'm not seeing this as fixed on my Pi 3B+ with
> > FEDORA-2021-84510c6f65, is there something more I have to do than "sudo dnf
> > upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-84510c6f65"?
> 
> After installing run 'rpi-uboot-update' to copy it into place.

Thanks, doing all of the above I see the issue as fixed.
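
The two steps from comments 19 and 21 can be put in one place for anyone repeating the test. This is a sketch under stated assumptions: the dnf command is copied from comment 19, while running 'rpi-uboot-update' through sudo is an assumption made here (comment 21 only says to run it after installing). The helper names are illustrative.

```python
import shlex
import subprocess
from typing import List


def build_update_commands(advisory: str) -> List[List[str]]:
    """Return the commands, in order, for installing and deploying the update."""
    return [
        # Install the update from updates-testing (command from comment 19).
        ["sudo", "dnf", "upgrade", "--enablerepo=updates-testing",
         f"--advisory={advisory}"],
        # Copy the new U-Boot into place (comment 21); sudo is an assumption.
        ["sudo", "rpi-uboot-update"],
    ]


def run_update(advisory: str) -> None:
    """Run each command, stopping at the first failure."""
    for argv in build_update_commands(advisory):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_update() explicitly to execute them.
    for argv in build_update_commands("FEDORA-2021-84510c6f65"):
        print(shlex.join(argv))
```

Installing the package alone was not enough in comment 20; the issue was only reported as fixed once 'rpi-uboot-update' had also been run.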

Comment 23 Fedora Update System 2021-03-19 20:15:31 UTC
FEDORA-2021-84510c6f65 has been pushed to the Fedora 34 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 24 Adam Williamson 2021-03-23 20:58:53 UTC
I think this isn't fixed in Beta release images, so gonna document it in CommonBugs.

