Created attachment 1714633 [details] F32 kernel 5.8.7 fails to boot with LUKS LVM home Created attachment 1714633 [details] F32 kernel 5.8.7 fails to boot with LUKS LVM home 1. Please describe the problem: Fedora 32 Xfce worked very well on Macbook Pro 13' (Early 2015), until the kernel upgrade from 5.8.4 to 5.8.6 and 5.8.7, which cause the LUKS encrypted LVM /home volume could no longer be mounted. 2. What is the Version-Release number of the kernel: Fedora release 32 (Thirty Two) 3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 : kernel-5.8.6-201.fc32 4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below: - Install a new Fedora 32 minimal OS - Custom partition, encrypt LVM with LUKS - Finish installation and reboot - Select kernel-5.8.7-200.fc32 and enter the passphase - OS hangs at "A start job is running for /dev/mapper/fedora_localhost_--live-home (<time> / no limit)" (please see the attachment) 5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``: Sorry, I didn't try it. 6. Are you running any modules that not shipped with directly Fedora's kernel?: Nope 7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.
Still not working with kernel-5.8.9-200.fc32.x86_64
- this error persists after a system reinstall with both F32 & F33 - 5.8.6 works from the live USB but not from the installed system - it is independent of the root FS (ext4/xfs) - independent of whether /home gets mounted or not Discussion on Ask Fedora https://ask.fedoraproject.org/t/f32-kernel-5-8-9-fails-to-boot-on-macbook-pro/9299 I have tried following steps: 1. I created a F33 Xfce Live USB and booted with it successfully. As a external storage, LUKS can be decrypted. Meanwhile, LVM root(xfs) and LVM home(ext4) both can be mounted (rw). So far so good. Then I started the f33 (pre-release) installation procedure – reformatted the LVM root with xfs, kept the LVM home(ext4). It was a perfect installation, after which I tried booting with f33 (kernel 5.8.6). Unfortunately, the issue was still there. See the journal here: https://paste.centos.org/view/b88fed31 f33-xfs-boot-failed 2. I repeated the f33 installation, but reformatted the LVM root with ext4, kept the LVM home(ext4). The failure seemed to be same. See the journal here: https://paste.centos.org/view/f28de1e3 3. At last, I reinstalled f32 again, and reformatted the LVM root with ext4, kept the LVM home(ext4). f32 (kernel 5.6.6) works very well. 4. I made an update, sudo dnf up -y , Then, rebooted to kernel 5.8.10, it hung there again. See the journal here: https://paste.centos.org/view/00428ad5
Hi Pany, Looks like the paste bin link you offered already expired. Will you please re-collect those log and upload to bugzilla as attachments?
Created attachment 1716753 [details] f32 kernel 5.8.10 with ext4 root fs fails to boot
Created attachment 1717170 [details] f32 kernel 5.8.11 with ext4 root fs fails to boot
The screenshot shows XFS metadata and IO errors. None of this is found in the text log files attached, they just stop after systemd journal is flushed. It sounds like the problem XFS file system was reformatted so evidence of the original problem maybe gone. In the text logs attached, I see no errors. Root has been successfully mounted, swap is activated, but nothing else is mounted yet. Since sysroot is read-write and the journal is being flushed - later events should also have made it into the journal. This can be retrieved using '-b' flag and choosing a prior boot: -1 for the previous boot, -2 for the boot before that, etc. It might be worth trying to boot with 'systemd.debug-shell=1' boot parameter, and seeing if you can switch to tty9. In this case you could just: # journalctl -b > journal-fail.txt # reboot And then boot a working kernel, and attach the failure journal.
In the failing case, there is this message: Sep 28 09:27:08 fedora.local systemd-cryptsetup[470]: WARNING: Locking directory /run/cryptsetup is missing! Do you see it in the successful case? The dm-crypt device is unlocked, and system root mounts.
Does the problem start with kernel 5.8.5 or 5.8.6? 5.8.5 for fc32 https://koji.fedoraproject.org/koji/buildinfo?buildID=1601937 You'll need these three, make sure to grab the correct arch: kernel-5.8.5-200.fc32.x86_64.rpm (info) (download) kernel-core-5.8.5-200.fc32.x86_64.rpm (info) (download) kernel-modules-5.8.5-200.fc32.x86_64.rpm (info) (download)
I guess the problem with the kernel hypothesis is that you're able to boot Fedora 33 pre-release media, which has been using kernel 5.8.6 for a while. Can you try a recent nightly? You don't have to reinstall, just see if you can boot, and also useful if you can unlock the LUKS volume, and mount either home or root. 20200927 workstation image for sure works in qemu-kvm, and has kernel 5.8.11 https://kojipkgs.fedoraproject.org/compose/branched/
(In reply to Chris Murphy from comment #7) > In the failing case, there is this message: > > Sep 28 09:27:08 fedora.local systemd-cryptsetup[470]: WARNING: Locking > directory /run/cryptsetup is missing! > > Do you see it in the successful case? The dm-crypt device is unlocked, and > system root mounts. Yep, in the successful case (kernel 5.6.6), journal shows following lines: 997 │ Sep 29 09:34:31 localhost.localdomain systemd[1]: Starting Cryptography Setup for luks-65d │ 9ed28-ea08-4ea5-a1dd-7b2b086f5e09... 998 │ Sep 29 09:34:31 localhost.localdomain systemd-cryptsetup[544]: WARNING: Locking directory │ /run/cryptsetup is missing! 999 │ Sep 29 09:34:31 localhost.localdomain systemd[1]: Started Forward Password Requests to Ply │ mouth.
Created attachment 1717399 [details] f32 kernel 5.8.11 systemd.debug-shell=1 journal
(In reply to Chris Murphy from comment #6) > The screenshot shows XFS metadata and IO errors. > > None of this is found in the text log files attached, they just stop after > systemd journal is flushed. It sounds like the problem XFS file system was > reformatted so evidence of the original problem maybe gone. > > In the text logs attached, I see no errors. Root has been successfully > mounted, swap is activated, but nothing else is mounted yet. Since sysroot > is read-write and the journal is being flushed - later events should also > have made it into the journal. This can be retrieved using '-b' flag and > choosing a prior boot: -1 for the previous boot, -2 for the boot before > that, etc. > > It might be worth trying to boot with 'systemd.debug-shell=1' boot > parameter, and seeing if you can switch to tty9. In this case you could just: > # journalctl -b > journal-fail.txt > # reboot > > And then boot a working kernel, and attach the failure journal. https://bugzilla.redhat.com/show_bug.cgi?id=1878347#c11 is the journal Here is what I did to get the journal: 1. At grub menu, press 'e' to edit kernel 5.8.11 entry 2. Replace 'rhgh quiet' with 'systemd.debug-shell=1' 3. Press F10 to boot (the Ctrl+x didn't work, but I think it's another topic) 4. Switch to tty9 5. 'mount' shows the '/' was read-only, and 'mount / -o remount,rw' says the device is write-protected (I'll attach a screenshot later) 6. Mount the boot partition to /boot, then 'journalctl -b > /boot/journal-fail.txt'
Created attachment 1717401 [details] edit grub2 entry (replace rhgb quiet with systemd.debug-shell=1)
Created attachment 1717402 [details] systemd.debug-shell=1 LUKS prompt to decrypt
Created attachment 1717403 [details] systemd.debug-shell=1 kernel 5.8.11 boot failed
Sep 29 10:49:28 fedora.local kernel: blk_update_request: I/O error, dev sdb, sector 197394688 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 Sep 29 10:49:28 fedora.local kernel: blk_update_request: I/O error, dev sdb, sector 197394688 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 ... Sep 29 10:49:28 fedora.local kernel: EXT4-fs (dm-1): I/O error while writing superblock ... Sep 29 10:49:28 fedora.local kernel: Buffer I/O error on dev dm-1, logical block 0, lost sync page write This is bad... I can't tell if this is some strange block layer bug we haven't seen before or if it's transient hardware problems. But you have literally no errors at all with kernel 5.8.4?
Created attachment 1717404 [details] systemd.debug-shell=1 tty9 root FS was mounted read only
Created attachment 1717405 [details] systemd.debug-shell=1 tty9 root FS was write protected
The I/O errors are on dev sdb, which I'm pretty sure is this (there are two SD Card Readers reported, sdb and sdc) Sep 29 10:49:17 fedora.local kernel: scsi 1:0:0:0: Direct-Access APPLE SD Card Reader 3.00 PQ: 0 ANSI: 6 I suspect the SD Card is failing, just because they are notoriously unreliable and tend to have transient failures before outright failing. But then I found this. It might be related... commit 1bbd2c5daed796a411f2f9bc33699013aff8726a Author: Manish Narani <manish.narani> Date: Wed Jul 15 00:42:33 2020 +0530 mmc: sdhci-of-arasan: fix timings allocation code This is in 5.8.6 but not 5.8.5.
Created attachment 1717406 [details] f32 kernel 5.8.5 booted successfully (In reply to Chris Murphy from comment #8) > Does the problem start with kernel 5.8.5 or 5.8.6? > > 5.8.5 for fc32 > https://koji.fedoraproject.org/koji/buildinfo?buildID=1601937 > > You'll need these three, make sure to grab the correct arch: > > kernel-5.8.5-200.fc32.x86_64.rpm (info) (download) > kernel-core-5.8.5-200.fc32.x86_64.rpm (info) (download) > kernel-modules-5.8.5-200.fc32.x86_64.rpm (info) (download) kernel-5.8.5-200 can boot successfully
(In reply to Chris Murphy from comment #8) > Does the problem start with kernel 5.8.5 or 5.8.6? > > 5.8.5 for fc32 > https://koji.fedoraproject.org/koji/buildinfo?buildID=1601937 > > You'll need these three, make sure to grab the correct arch: > > kernel-5.8.5-200.fc32.x86_64.rpm (info) (download) > kernel-core-5.8.5-200.fc32.x86_64.rpm (info) (download) > kernel-modules-5.8.5-200.fc32.x86_64.rpm (info) (download) And I downloaded kernel 5.8.6-200 from https://koji.fedoraproject.org/koji/buildinfo?buildID=1605304 then run dnf in kernel-{,core-,modules-}5.8.6-200.fc32.x86_64.rpm It failed to boot. So maybe something different between 5.8.5-200 and 5.8.6-200 cause the failure?
I wonder if there's any interaction going on between brcm (wifi) and the sd card slot. I also see brcmfmac messages interlaced with these errors and the stack trace. There is a problem reading lib firmware, but that might be the result of the mmc patch. Sep 29 10:49:28 fedora.local kernel: brcmfmac 0000:03:00.0: loading /lib/firmware/brcm/brcmfmac43602-pcie.bin failed with error -5 Also in 5.8.6... commit f02889decdc2ebcacb1e51e00829db835e8133d1 Author: Nicolas Saenz Julienne <nsaenzjulienne> Date: Tue Jul 21 13:23:02 2020 +0200 brcmfmac: Set timeout value when configuring power save This could be a coincidence, and distraction. My old 2011 mac, I'm pretty sure the firmware for wifi was also needed for mmcblk/sd card to work correctly. At this point I think it's worth posting an email to the upstream mmc list, summarizing the findings so far, and you've narrowed it down to a change in 5.8.6. Consider including the suspicion (the two commits I've referred to) and you can reference this bug URL in your email. They'll want to know right off the bat that you're booting from SD Card slot, and the first failure messages (c16 should suffice). http://vger.kernel.org/vger-lists.html#linux-mmc I suggest also attaching to this bug report the output from: lspci -vvnn You can reference the first line (each) for SD host controller and the 802.11 wireless, include that in the email to mmc list. If you're feeling adventurous, then you could compile the kernel and try reverting (separately) these commits and see if that fixes the problem, that'd be fairly conclusive. If it doesn't then...it's harder to narrow down what's going on.
(In reply to Pany from comment #21) > It failed to boot. So maybe something different between 5.8.5-200 and 5.8.6-200 cause the failure? Looks like it. The commits I refer to in c22 and c19 are in the 5.8.6 changelog. https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.8.6
(In reply to Chris Murphy from comment #19) > The I/O errors are on dev sdb, which I'm pretty sure is this (there are two > SD Card Readers reported, sdb and sdc) > > Sep 29 10:49:17 fedora.local kernel: scsi 1:0:0:0: Direct-Access APPLE > SD Card Reader 3.00 PQ: 0 ANSI: 6 > > I suspect the SD Card is failing, just because they are notoriously > unreliable and tend to have transient failures before outright failing. But > then I found this. It might be related... > > commit 1bbd2c5daed796a411f2f9bc33699013aff8726a > Author: Manish Narani <manish.narani> > Date: Wed Jul 15 00:42:33 2020 +0530 > > mmc: sdhci-of-arasan: fix timings allocation code > > This is in 5.8.6 but not 5.8.5. My disk layout: $ sudo lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 113G 0 disk ├─sda1 8:1 0 200M 0 part └─sda2 8:2 0 112.2G 0 part sdb 8:16 1 116G 0 disk ├─sdb1 8:17 1 600M 0 part ├─sdb2 8:18 1 600M 0 part /boot/efi ├─sdb3 8:19 1 1G 0 part /boot └─sdb4 8:20 1 113.8G 0 part └─luks-65d9ed28-ea08-4ea5-a1dd-7b2b086f5e09 253:0 0 113.8G 0 crypt ├─fedora_localhost--live-root 253:1 0 70G 0 lvm / ├─fedora_localhost--live-swap 253:2 0 7.8G 0 lvm [SWAP] └─fedora_localhost--live-home 253:3 0 36G 0 lvm /home sda (SATA SSD shipped with Macbook) is for macOS and sdb (an SD card inserted into the Macbook SD Card Reader) is for fedora.
Created attachment 1717469 [details] f32 kernel 5.8.6 systemd.debug-shell=1 journal (In reply to Chris Murphy from comment #22) > I wonder if there's any interaction going on between brcm (wifi) and the sd > card slot. I also see brcmfmac messages interlaced with these errors and the > stack trace. There is a problem reading lib firmware, but that might be the > result of the mmc patch. > > Sep 29 10:49:28 fedora.local kernel: brcmfmac 0000:03:00.0: loading > /lib/firmware/brcm/brcmfmac43602-pcie.bin failed with error -5 > > Also in 5.8.6... > > commit f02889decdc2ebcacb1e51e00829db835e8133d1 > Author: Nicolas Saenz Julienne <nsaenzjulienne> > Date: Tue Jul 21 13:23:02 2020 +0200 > > brcmfmac: Set timeout value when configuring power save > > > This could be a coincidence, and distraction. My old 2011 mac, I'm pretty > sure the firmware for wifi was also needed for mmcblk/sd card to work > correctly. > In order to avoid any misleading info, I booted kernel 5.8.6 with systemd.debug-shell=1, but there was no brcmfmac error. > At this point I think it's worth posting an email to the upstream mmc list, > summarizing the findings so far, and you've narrowed it down to a change in > 5.8.6. Consider including the suspicion (the two commits I've referred to) > and you can reference this bug URL in your email. They'll want to know right > off the bat that you're booting from SD Card slot, and the first failure > messages (c16 should suffice). > http://vger.kernel.org/vger-lists.html#linux-mmc > > I suggest also attaching to this bug report the output from: lspci -vvnn > > You can reference the first line (each) for SD host controller and the > 802.11 wireless, include that in the email to mmc list. > 'lspci -vvnn' shows: ... 00:14.0 USB controller [0c03]: Intel Corporation Wildcat Point-LP USB xHCI Controller [8 086:9cb1] (rev 03) (prog-if 30 [XHCI]) Subsystem: Intel Corporation Device [8086:7270] ... 03:00.0 Network controller [0280]: Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC [14e4:43ba] (rev 01) Subsystem: Apple Inc. Device [106b:0133] ... 03:00.0 Network controller [0280]: Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC [14e4:43ba] (rev 01) Subsystem: Apple Inc. Device [106b:0133] ... I wonder if the SD card controller is handled by the USB controller, since 'lsusb -vv' shows: Bus 002 Device 002: ID 05ac:8406 Apple, Inc. Internal Memory Card Reader Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 3.00 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 9 idVendor 0x05ac Apple, Inc. idProduct 0x8406 Internal Memory Card Reader ... > If you're feeling adventurous, then you could compile the kernel and try > reverting (separately) these commits and see if that fixes the problem, > that'd be fairly conclusive. If it doesn't then...it's harder to narrow down > what's going on. To be honest, it's kinda hard for me, but I'll try it, just need some time to search for docs and guide.
(In reply to Chris Murphy from comment #19) > The I/O errors are on dev sdb, which I'm pretty sure is this (there are two > SD Card Readers reported, sdb and sdc) > > Sep 29 10:49:17 fedora.local kernel: scsi 1:0:0:0: Direct-Access APPLE > SD Card Reader 3.00 PQ: 0 ANSI: 6 > > I suspect the SD Card is failing, just because they are notoriously > unreliable and tend to have transient failures before outright failing. But > then I found this. It might be related... > > commit 1bbd2c5daed796a411f2f9bc33699013aff8726a > Author: Manish Narani <manish.narani> > Date: Wed Jul 15 00:42:33 2020 +0530 > > mmc: sdhci-of-arasan: fix timings allocation code > > This is in 5.8.6 but not 5.8.5. Another detail: When trying to boot kernel 5.8.6 with systemd.debug-shell=1 to save the journal, I had to mount the boot partition since the root FS was write-protected. Here comes the wired thing - I expected the SD card I boot from to be /dev/sdb and /dev/sdb3 should be the boot partition, but it turned out that only /dev/sdc3 could be mounted as ext4 FS successfully, and it's exactly the /boot partition as well as the /dev/sdb3 when I running kernel 5.8.5 (the good one). There's only one SATA SSD (sda, macOS) and one SD Card Reader (fedora installed into the SD card) on Macbook. And the journal do show two SD Card Readers - sdb and sdc.
(In reply to Pany from comment #25) > > If you're feeling adventurous, then you could compile the kernel and try > > reverting (separately) these commits and see if that fixes the problem, > > that'd be fairly conclusive. If it doesn't then...it's harder to narrow down > > what's going on. > > To be honest, it's kinda hard for me, but I'll try it, just need some time > to search for docs and guide. I've built a scratch build kernel with the commit in comment 22 reverted. Maybe you can try this one https://koji.fedoraproject.org/koji/taskinfo?taskID=52449450 Note, the scratch build usually expires in 3 days (or 7 days? I don't remember for sure). So you might want to download it to your local disk as soon as possible.
(In reply to Chris Murphy from comment #16) > Sep 29 10:49:28 fedora.local kernel: blk_update_request: I/O error, dev sdb, > sector 197394688 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > Sep 29 10:49:28 fedora.local kernel: blk_update_request: I/O error, dev sdb, > sector 197394688 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 > ... > Sep 29 10:49:28 fedora.local kernel: EXT4-fs (dm-1): I/O error while writing > superblock > ... > Sep 29 10:49:28 fedora.local kernel: Buffer I/O error on dev dm-1, logical > block 0, lost sync page write > > This is bad... I can't tell if this is some strange block layer bug we > haven't seen before or if it's transient hardware problems. But you have > literally no errors at all with kernel 5.8.4? Block layer errors, noting to do with the filesystems. Dropping back off cc, sorry ;)
(In reply to Pany from comment #25) > 'lspci -vvnn' shows: > ... > 00:14.0 USB controller [0c03]: Intel Corporation Wildcat Point-LP USB xHCI > Controller [8 > 086:9cb1] (rev 03) (prog-if 30 [XHCI]) > Subsystem: Intel Corporation Device [8086:7270] > ... > > I wonder if the SD card controller is handled by the USB controller, since > 'lsusb -vv' shows: > > Bus 002 Device 002: ID 05ac:8406 Apple, Inc Internal Memory Card Reader > Device Descriptor: Good catch. There's a bunch of usb related commits in 5.8.6 too. I have no idea which ones are likely suspects. It's probably easiest to do a stable kernel bisect between tags 5.8.5 and 5.8.6 and let git bisect manage everything. > To be honest, it's kinda hard for me, but I'll try it, just need some time > to search for docs and guide. I use this as a starting point: https://kernelnewbies.org/KernelBuild For this case you want the stable tree. And for git bisect, maybe this as a starting point https://thoughtbot.com/blog/git-bisect Best to ask on #fedora-qa or #fedora-devel for cmurf, and I can help you abbreviate the process. For sure you will want to use 'make localmodconfig' which will build a kernel for your system, largely not building things you're not using - so why spend the time compiling those things? As for /dev/sda /dev/sdb /dev/sdc being inconsistent - that's expected. Every boot those could get assigned to different devices.
(In reply to Zamir SUN from comment #27) > I've built a scratch build kernel with the commit in comment 22 reverted. > Maybe you can try this one > https://koji.fedoraproject.org/koji/taskinfo?taskID=52449450 > > Note, the scratch build usually expires in 3 days (or 7 days? I don't > remember for sure). So you might want to download it to your local disk as > soon as possible. Thank you, but the boot failure seems the same...
Just random thought, can you check the initramfs of both the working kernel and the non-functional kernel with the following command, and share the output here? # sudo lsinitrd /boot/initramfs-$(uname -r).img | egrep "module|firmware" For the non-functional kernel, just replace $(uname -r) with the kernel version.
Created attachment 1717728 [details] lsinitrd /boot/initramfs-5.8.5-200.fc32.x86_64.img (the good one) (In reply to Zamir SUN from comment #31) > Just random thought, can you check the initramfs of both the working kernel > and the non-functional kernel with the following command, and share the > output here? > > # sudo lsinitrd /boot/initramfs-$(uname -r).img | egrep "module|firmware" > > For the non-functional kernel, just replace $(uname -r) with the kernel > version. I downloaded kernel 5.8.5-200 from here: https://koji.fedoraproject.org/koji/buildinfo?buildID=1601937 then: sudo dnf in kernel-{,core-,modules-}5.8.5-200.fc32.x86_64.rpm This is the latest kernel I can boot successfully.
Created attachment 1717729 [details] lsinitrd /boot/initramfs-5.8.6-200.fc32.x86_64.img (the failed one) (In reply to Zamir SUN from comment #31) > Just random thought, can you check the initramfs of both the working kernel > and the non-functional kernel with the following command, and share the > output here? > > # sudo lsinitrd /boot/initramfs-$(uname -r).img | egrep "module|firmware" > > For the non-functional kernel, just replace $(uname -r) with the kernel > version. I downloaded kernel 5.8.6-200 from here: https://koji.fedoraproject.org/koji/buildinfo?buildID=1605304 then: sudo dnf in kernel-{,core-,modules-}5.8.6-200.fc32.x86_64.rpm This is the first kernel I can not boot.
(In reply to Chris Murphy from comment #19) > The I/O errors are on dev sdb, which I'm pretty sure is this (there are two > SD Card Readers reported, sdb and sdc) > > Sep 29 10:49:17 fedora.local kernel: scsi 1:0:0:0: Direct-Access APPLE > SD Card Reader 3.00 PQ: 0 ANSI: 6 > > I suspect the SD Card is failing, just because they are notoriously > unreliable and tend to have transient failures before outright failing. But > then I found this. It might be related... > > commit 1bbd2c5daed796a411f2f9bc33699013aff8726a > Author: Manish Narani <manish.narani> > Date: Wed Jul 15 00:42:33 2020 +0530 > > mmc: sdhci-of-arasan: fix timings allocation code > > This is in 5.8.6 but not 5.8.5. I made a patch to revert this commit: $ cat 0001-revert1bbd2c5.patch From bde53877b039a31ac884987e0358a41179686f97 Mon Sep 17 00:00:00 2001 From: Pany <pany> Date: Wed, 30 Sep 2020 14:23:46 +0800 Subject: [PATCH] revert 1bbd2c5 --- drivers/mmc/host/sdhci-of-arasan.c | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c index d0a80bfb9..fb26e743e 100644 --- a/drivers/mmc/host/sdhci-of-arasan.c +++ b/drivers/mmc/host/sdhci-of-arasan.c @@ -1025,6 +1025,7 @@ static void arasan_dt_read_clk_phase(struct device *dev, static void arasan_dt_parse_clk_phases(struct device *dev, struct sdhci_arasan_clk_data *clk_data) { + int *iclk_phase, *oclk_phase; u32 mio_bank = 0; int i; @@ -1036,32 +1037,28 @@ static void arasan_dt_parse_clk_phases(struct device *dev, clk_data->set_clk_delays = sdhci_arasan_set_clk_delays; if (of_device_is_compatible(dev->of_node, "xlnx,zynqmp-8.9a")) { - u32 zynqmp_iclk_phase[MMC_TIMING_MMC_HS400 + 1] = - ZYNQMP_ICLK_PHASE; - u32 zynqmp_oclk_phase[MMC_TIMING_MMC_HS400 + 1] = - ZYNQMP_OCLK_PHASE; + iclk_phase = (int [MMC_TIMING_MMC_HS400 + 1]) ZYNQMP_ICLK_PHASE; + oclk_phase = (int [MMC_TIMING_MMC_HS400 + 1]) ZYNQMP_OCLK_PHASE; of_property_read_u32(dev->of_node, "xlnx,mio-bank", &mio_bank); if (mio_bank == 2) { - zynqmp_oclk_phase[MMC_TIMING_UHS_SDR104] = 90; - zynqmp_oclk_phase[MMC_TIMING_MMC_HS200] = 90; + oclk_phase[MMC_TIMING_UHS_SDR104] = 90; + oclk_phase[MMC_TIMING_MMC_HS200] = 90; } for (i = 0; i <= MMC_TIMING_MMC_HS400; i++) { - clk_data->clk_phase_in[i] = zynqmp_iclk_phase[i]; - clk_data->clk_phase_out[i] = zynqmp_oclk_phase[i]; + clk_data->clk_phase_in[i] = iclk_phase[i]; + clk_data->clk_phase_out[i] = oclk_phase[i]; } } if (of_device_is_compatible(dev->of_node, "xlnx,versal-8.9a")) { - u32 versal_iclk_phase[MMC_TIMING_MMC_HS400 + 1] = - VERSAL_ICLK_PHASE; - u32 versal_oclk_phase[MMC_TIMING_MMC_HS400 + 1] = - VERSAL_OCLK_PHASE; + iclk_phase = (int [MMC_TIMING_MMC_HS400 + 1]) VERSAL_ICLK_PHASE; + oclk_phase = (int [MMC_TIMING_MMC_HS400 + 1]) VERSAL_OCLK_PHASE; for (i = 0; i <= MMC_TIMING_MMC_HS400; i++) { - clk_data->clk_phase_in[i] = versal_iclk_phase[i]; - clk_data->clk_phase_out[i] = versal_oclk_phase[i]; + clk_data->clk_phase_in[i] = iclk_phase[i]; + clk_data->clk_phase_out[i] = oclk_phase[i]; } } -- 2.26.2 Then build the kernel with this patch, but still not able to boot kernel 5.8.6. I'll try to git bisect other commits related to USB. Thank you Chris!
(In reply to Chris Murphy from comment #29) > Good catch. There's a bunch of usb related commits in 5.8.6 too. I have no > idea which ones are likely suspects. It's probably easiest to do a stable > kernel bisect between tags 5.8.5 and 5.8.6 and let git bisect manage > everything. Finally, I found the only commit to blame: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.8.y&id=53965c79c2dbdc898ffc7fdbab37e920594f5e14 I can successfully boot with kernel 5.8.6 as long as I reverted this commit and rebuild the kernel.rpm
(In reply to Pany from comment #35) > Finally, I found the only commit to blame: > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/ > ?h=linux-5.8.y&id=53965c79c2dbdc898ffc7fdbab37e920594f5e14 > > I can successfully boot with kernel 5.8.6 as long as I reverted this commit > and rebuild the kernel.rpm Great catch! Are you willing to report this to mainline kernel with your findings? That will draw attention for kernel developers and (hopefully) get this under better attention. I think either linux-usb list could be a good place to start with. http://vger.kernel.org/vger-lists.html#linux-usb
(In reply to Zamir SUN from comment #36) > Are you willing to report this to mainline kernel with your findings? That > will draw attention for kernel developers and (hopefully) get this under > better attention. > > I think either linux-usb list could be a good place to start with. > > http://vger.kernel.org/vger-lists.html#linux-usb Thank you, I've sent a message to the linux-usb list. https://marc.info/?l=linux-usb&m=160295084610971&w=2
I've commented on the upstream discussion and prepared 2 patches which I'm compiling and will testing shortly. I'll try to make a scratch build available once I've tested it locally.
(In reply to Pany from comment #35) > (In reply to Chris Murphy from comment #29) > > Good catch. There's a bunch of usb related commits in 5.8.6 too. I have no > > idea which ones are likely suspects. It's probably easiest to do a stable > > kernel bisect between tags 5.8.5 and 5.8.6 and let git bisect manage > > everything. > > Finally, I found the only commit to blame: > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/ > ?h=linux-5.8.y&id=53965c79c2dbdc898ffc7fdbab37e920594f5e14 > > I can successfully boot with kernel 5.8.6 as long as I reverted this commit > and rebuild the kernel.rpm Could you please test with this scratch build? https://koji.fedoraproject.org/koji/taskinfo?taskID=53966375 It contains the 2 patches I sent to the linux-usb list already.
Hello all, I published two patches at [1]. The first patch is modified/updated, and the second patch is an unmodified version of Bastien's second patch. I verified that the patches do not introduce regressions. I do not have an Apple MacBook to verify that the actual issue is resolved, but the issue should be resolved with these patches. Thanks! [1] https://lore.kernel.org/linux-usb/20201022135521.375211-1-m.v.b@runbox.com/T/#m4a37b74da5ecad69b82322847959853daea420ba
(In reply to Bastien Nocera from comment #39) > (In reply to Pany from comment #35) > Could you please test with this scratch build? > > https://koji.fedoraproject.org/koji/taskinfo?taskID=53966375 > > It contains the 2 patches I sent to the linux-usb list already. Bad news, this scratch build does not work as expected. 1. With "systemd.debug-shell=1" appended to cmdline, I tried to capture the journal. But the Macbook built-in keyboard stopped responding when I tried to press F9. 2. Plugging in an extra USB keyboard, I pressed F9 and tried to mount /dev/sdb3 or /dev/sdc3 to /boot in order to save the journal, but neither of them existed. 3. Appending "module_blacklist=apple_mfi_fastcharge" to cmdline, the kernel-5.10.0-0.rc0.20201021git071a0578b0ce.49.local.fc33 booted successfully.
(In reply to Pany from comment #41) > (In reply to Bastien Nocera from comment #39) > > (In reply to Pany from comment #35) > > Could you please test with this scratch build? > > > > https://koji.fedoraproject.org/koji/taskinfo?taskID=53966375 > > > > It contains the 2 patches I sent to the linux-usb list already. > > Bad news, this scratch build does not work as expected. > > 1. With "systemd.debug-shell=1" appended to cmdline, I tried to capture the > journal. But the Macbook built-in keyboard stopped responding when I tried > to press F9. > 2. Plugging in an extra USB keyboard, I pressed F9 and tried to mount > /dev/sdb3 or /dev/sdc3 to /boot in order to save the journal, but neither of > them existed. > 3. Appending "module_blacklist=apple_mfi_fastcharge" to cmdline, the > kernel-5.10.0-0.rc0.20201021git071a0578b0ce.49.local.fc33 booted > successfully. OK. I'll try to setup a reproducer on a MacBook Air I have here, that'll make it easier to test. Thanks.
(In reply to Bastien Nocera from comment #42) <snip> > OK. I'll try to setup a reproducer on a MacBook Air I have here, that'll > make it easier to test. I've managed to reproduce the original problem, though it'll take me a little while to reproduce it again with the patches, and get enough debug to find what the problem could be.
(In reply to Bastien Nocera from comment #43) > (In reply to Bastien Nocera from comment #42) > <snip> > > OK. I'll try to setup a reproducer on a MacBook Air I have here, that'll > > make it easier to test. > > I've managed to reproduce the original problem, though it'll take me a little > while to reproduce it again with the patches, and get enough debug to find > what the problem could be. Hello Bastien, I was able to get my hands onto an Apple MacBook Air, and I confirmed that the two patches that I published at [1] indeed resolve the issue at hand. I am including below dmesg output which shows that 5.6.6-300.fc32 does not exhibit the issue, 5.8.15-201.fc32 exhibits the issue, and a custom-built kernel with the two patches does /not/ exhibit the issue. The patches can be seen here: https://lore.kernel.org/linux-usb/20201022135521.375211-1-m.v.b@runbox.com/T/#m4a37b74da5ecad69b82322847959853daea420ba I hope that this helps! Oct 23 23:31:06 kernel: Linux version 5.6.6-300.fc32.x86_64 (mockbuild.fedoraproject.org) (gcc version 10.0.1 20200328 (Red Hat 10.0.1-0.11) (GCC)) #1 SMP Tue Apr 21 13:44:19 UTC 2020 Oct 23 23:31:08 kernel: scsi 2:0:0:0: Direct-Access APPLE SD Card Reader 3.00 PQ: 0 ANSI: 6 [root@localhost ~]# journalctl -b-1 --no-hostname -k | grep -e 'SD Card Reader' -e 'apple-mfi-fastcharge' -e 'Linux version' Oct 24 01:01:22 kernel: Linux version 5.8.15-201.fc32.x86_64 (mockbuild.fedoraproject.org) (gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), GNU ld version 2.34-5.fc32) #1 SMP Thu Oct 15 15:56:44 UTC 2020 Oct 24 01:01:24 kernel: scsi 2:0:0:0: Direct-Access APPLE SD Card Reader 3.00 PQ: 0 ANSI: 6 Oct 24 01:01:38 kernel: usbcore: registered new device driver apple-mfi-fastcharge Oct 24 01:01:38 kernel: scsi 2:0:0:0: Direct-Access APPLE SD Card Reader 3.00 PQ: 0 ANSI: 6 [root@localhost ~]# journalctl -b0 --no-hostname -k | grep -e 'SD Card Reader' -e 'apple-mfi-fastcharge' -e 'Linux version' Oct 24 01:11:20 kernel: Linux version 5.8.16-1.local (user@builder) (gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), GNU ld version 2.26.1-1.fc25) #64 SMP Sat Oct 24 00:41:24 EDT 2020 Oct 24 01:11:23 kernel: scsi 2:0:0:0: Direct-Access APPLE SD Card Reader 3.00 PQ: 0 ANSI: 6 Oct 24 01:11:52 kernel: usbcore: registered new device driver apple-mfi-fastcharge
My original version of the patch was buggy (should also have changed __check_for_non_generic_match() and __usb_bus_reprobe_drivers()) but Vefa's latest version works for me. The apple-mfi-fastcharge gets autoloaded on a MacBook Air, but nothing uses it by default, and plugging an iOS device gets it matched as expected. A kernel scratch build with Vefa's patches is available here: https://koji.fedoraproject.org/koji/taskinfo?taskID=54314890 Please reply to the mailing-list with "Tested-by" tags, as I'm about to do once you've tested this successfully.
(In reply to Bastien Nocera from comment #45) > My original version of the patch was buggy (should also have changed > __check_for_non_generic_match() and __usb_bus_reprobe_drivers()) > but Vefa's latest version works for me. The apple-mfi-fastcharge gets > autoloaded on a MacBook Air, but > nothing uses it by default, and plugging an iOS device gets it matched as > expected. > > A kernel scratch build with Vefa's patches is available here: > https://koji.fedoraproject.org/koji/taskinfo?taskID=54314890 > > Please reply to the mailing-list with "Tested-by" tags, as I'm about to do > once you've tested this successfully. Thank you, Bastien, this scratch build works very well. And I've replied to the linux-usb mailing-list. Thank you everybody helped me with this!
I upgraded to Fedora 33 (kernel 5.9.8), this issue has already been fixed. So I think this report can be closed. Thanks to everybody again.