Created attachment 1673722 [details] dmesg 1. Please describe the problem: The machine will randomly lock up at times. 2. What is the Version-Release number of the kernel: kernel-5.5.10-200.fc31.x86_64 3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 : Yes, in Fedora 30. Unknown which version at the time. 4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below: No. Not sure what triggers it. Happens several times a week though. Sometimes several times a day. 5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``: Not tested. 6. Are you running any modules that not shipped with directly Fedora's kernel?: No. 7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag. Attached output of "sudo journalctl -b --no-hostname -k > tmp/dmesg.txt"
The reason we suspect sof-audio-pci is because: 1. These are the only suspicious lines we see at the end of the log before each crash: > Mar 26 09:56:01 aleta kernel: sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0 > Mar 26 09:56:01 aleta kernel: sof-audio-pci 0000:00:1f.3: firmware boot complete Note that they also occur at other times when the system doesn't lock up. 2. This started happening after we installed alsa-firmware. Audio was non-functional before that.
Some googling found this: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1847250 Are those patches and firmware updates in Fedora 31?
Bug 1787387, Comment 16, has a suggested test. I think he means to run those commands in separate terminal windows.
(In reply to Pierre Ossman from comment #1) ... > 2. This started happening after we installed alsa-firmware. Audio was non-functional before that. I suggest changing the component to alsa-firmware. > The reason we suspect sof-audio-pci is because: > > 1. These are the only suspicious lines we see at the end of the log before > each crash: > > > Mar 26 09:56:01 aleta kernel: sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0 > > Mar 26 09:56:01 aleta kernel: sof-audio-pci 0000:00:1f.3: firmware boot complete > > Note that they also occur at other times when the system doesn't lock up. >
And please post the output from: $ rpm -qa alsa\* | sort
We've gotten a bit of corona here this week, but we'll get your questions sorted out as soon as we can.
(In reply to Steve from comment #3) > Bug 1787387, Comment 16, has a suggested test. I think he means to run those > commands in separate terminal windows. Unfortunately we couldn't provoke a hang using those commands. What we can see though is that logging in and out somewhat reliable hangs the system at some point (usually right after logout). (In reply to Steve from comment #5) > And please post the output from: > > $ rpm -qa alsa\* | sort alsa-firmware-1.2.1-4.fc31.noarch alsa-lib-1.2.2-1.fc31.i686 alsa-lib-1.2.2-1.fc31.x86_64 alsa-plugins-pulseaudio-1.2.2-1.fc31.x86_64 alsa-tools-firmware-1.2.2-1.fc31.x86_64 alsa-ucm-1.2.2-1.fc31.noarch alsa-utils-1.2.2-1.fc31.x86_64
(In reply to Pierre Ossman from comment #7) > (In reply to Steve from comment #3) > > Bug 1787387, Comment 16, has a suggested test. I think he means to run those > > commands in separate terminal windows. > > Unfortunately we couldn't provoke a hang using those commands. > > What we can see though is that logging in and out somewhat reliable hangs the system at some point (usually right after logout). Thanks for reporting that. That could be a clue. Your system is booting with: Mar 26 10:15:19 kernel: Kernel is locked down from EFI Secure Boot mode; see man kernel_lockdown.7 And the last line in the log is: Mar 26 09:40:40 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 > (In reply to Steve from comment #5) > > And please post the output from: > > > > $ rpm -qa alsa\* | sort > > alsa-firmware-1.2.1-4.fc31.noarch > alsa-lib-1.2.2-1.fc31.i686 > alsa-lib-1.2.2-1.fc31.x86_64 > alsa-plugins-pulseaudio-1.2.2-1.fc31.x86_64 > alsa-tools-firmware-1.2.2-1.fc31.x86_64 > alsa-ucm-1.2.2-1.fc31.noarch > alsa-utils-1.2.2-1.fc31.x86_64 Thanks. Do you know whether this i686 package is needed: alsa-lib-1.2.2-1.fc31.i686 Also, there is a newer version of alsa-firmware available: $ dnf -q repoquery alsa-firmware --repo=updates-testing alsa-firmware-0:1.2.1-5.fc31.noarch Update with: # dnf update alsa-firmware --enablerepo=updates-testing
(In reply to Steve from comment #8) ... > Update with: > > # dnf update alsa-firmware --enablerepo=updates-testing Let's get a kernel update too: # dnf list alsa-firmware kernel --enablerepo=updates-testing --refresh NB: There is another kernel in the pipeline: kernel-5.5.15-200.fc31, kernel-headers-5.5.15-200.fc31, & 1 more https://bodhi.fedoraproject.org/updates/FEDORA-2020-666f3b1ac3
# dnf update alsa-firmware kernel --enablerepo=updates-testing --refresh ^^^^^^
We got a second machine experiencing this now after a kernel upgrade (same alsa-firmware though). It gets the same "no reply expected" errors, but it also got this: > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: HDA Analog: ASoC: hw_params BE failed -110 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: Analog Playback and Capture: ASoC: hw_params BE failed -110 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: ASoC: can't set Analog CPU DAI hw params: -110 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: error: failed to set dai config for Analog Playback and Capture > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: error: dsp irq ppsts 0x00000000 adspis 0x00000000 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: error: hda irq intsts 0x00000000 intlctl 0xc0000180 rirb 00 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: error: unexpected fault 0x00000000 trace 0x00000005 > Apr 06 08:35:32 samuel.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: error: ipc timed out for 0x80010000 size 120 This machine also tends to lock up directly on boot, not just on logout. We tested upgrading to that kernel and alsa-firmware from updates-testing, but unfortunately the problem remains. Also note that reverting to the old driver makes the problem go away, so that's another thing pointing at the sof driver being the culprit.
Thanks for testing with the newer kernel and firmware. Could you please attach a log for the latest kernel and firmware. If you have to reboot, please attach the log from the previous boot: $ journalctl -b -1 -k --no-hostname > dmesg-2.txt ^^-- selects the log for the previous boot Also, I believe this bug would be better placed under one of the "alsa" components. This looks like a good choice: "alsa-sof-firmware". BTW, this looks similar: Bug 1812415 - sof-audio-pci panic for intel ice lake, probably at resume
Created attachment 1676864 [details] dmesg-2.txt Added an attachment with the logs from the previous boot with the 5.5.15 kernel (which locked up when booting).
(In reply to Samuel from comment #13) > Created attachment 1676864 [details] > dmesg-2.txt > > Added an attachment with the logs from the previous boot with the 5.5.15 kernel (which locked up when booting). Thanks. The log is truncated, so we will have to try getting something from the display instead. Try removing "rhgb quiet" from the kernel command-line in grub2. If it hangs with any messages on the display, please take a photo and attach it. If you don't see the grub2 menu while booting, try repeatedly tapping the "Esc" key while booting.
Unfortunately it hangs with a blank screen, I can't get it do display the text it ends with. I made a video: https://www.youtube.com/watch?v=0ezKDxfUq7Q&feature=youtu.be
(In reply to Samuel from comment #15) > Unfortunately it hangs with a blank screen, I can't get it do display the text it ends with. > > I made a video: https://www.youtube.com/watch?v=0ezKDxfUq7Q&feature=youtu.be Thanks. I played the video at 0.25 speed and in full-screen mode. The compression artifacts make the text nearly unreadable, but I believe that it can be interpreted as follows: The lines with "Starting" and "Started" are from systemd. Near the end, I believe I see: "Started Command Scheduler." (I copied that text from the "journalctl -b" output in my F31 VM.) Then there are five lines, with the first line containing the text "available by default". A search of your first log finds what I believe is that line. I believe that the last line visible in the video is the one identified below. $ fgrep -A26 -n 'no longer available by default.' dmesg-1.txt 1033:Mar 26 09:15:29 kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this. 1034-Mar 26 09:15:29 kernel: tun: Universal TUN/TAP device driver, 1.6 1035-Mar 26 09:15:29 kernel: virbr0: port 1(virbr0-nic) entered blocking state 1036-Mar 26 09:15:29 kernel: virbr0: port 1(virbr0-nic) entered disabled state 1037-Mar 26 09:15:29 kernel: device virbr0-nic entered promiscuous mode <<<<< Last line displayed in video before display goes blank. 1038-Mar 26 09:15:29 kernel: virbr0: port 1(virbr0-nic) entered blocking state 1039-Mar 26 09:15:29 kernel: virbr0: port 1(virbr0-nic) entered listening state 1040-Mar 26 09:15:29 kernel: virbr0: port 1(virbr0-nic) entered disabled state 1041-Mar 26 09:15:29 kernel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details. 1042-Mar 26 09:15:29 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 1043-Mar 26 09:15:29 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 1044-Mar 26 09:15:29 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 1045-Mar 26 09:15:31 kernel: sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0 1046-Mar 26 09:15:31 kernel: sof-audio-pci 0000:00:1f.3: firmware boot complete 1047-Mar 26 09:15:32 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 1048-Mar 26 09:15:32 kernel: rfkill: input handler disabled 1049-Mar 26 09:15:33 kernel: FS-Cache: Loaded 1050-Mar 26 09:15:33 kernel: FS-Cache: Netfs 'nfs' registered for caching 1051-Mar 26 09:15:47 kernel: rfkill: input handler enabled 1052-Mar 26 09:15:47 kernel: sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0 1053-Mar 26 09:15:47 kernel: sof-audio-pci 0000:00:1f.3: firmware boot complete 1054-Mar 26 09:15:48 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 1055-Mar 26 09:15:48 kernel: rfkill: input handler disabled 1056-Mar 26 09:19:54 kernel: sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0 1057-Mar 26 09:19:54 kernel: sof-audio-pci 0000:00:1f.3: firmware boot complete 1058-Mar 26 09:22:15 kernel: sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0 1059-Mar 26 09:22:15 kernel: sof-audio-pci 0000:00:1f.3: firmware boot complete
(In reply to Samuel from comment #15) ... > I made a video: https://www.youtube.com/watch?v=0ezKDxfUq7Q&feature=youtu.be For the record, YouTube videos can be stepped frame-by-frame with the "." (forward) and "," (backward) keys. Pause the video near where you want to view each frame before stepping. Thanks to this web page: Frame by frame control of a YouTube video? https://webapps.stackexchange.com/questions/6507/frame-by-frame-control-of-a-youtube-video
Could you please post the output from: $ lsmod That will give us a list of kernel modules that we can try blacklisting. Unfortunately, sof-audio-pci is not a kernel module: $ modinfo sof-audio-pci modinfo: ERROR: Module sof-audio-pci not found. If we can find a "sof" kernel module name, it would be blacklisted on the kernel command-line by adding something like this: module_blacklist=xxx-yyy ^^^^^^^ The name of a kernel module would go there. Documentation: The kernel’s command-line parameters https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html For reference, here is a list of installed "sof" kernel modules from my F31 VM: $ find /lib/modules/5.5.15-200.fc31.x86_64 -wholename \*/sof/\* /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/xtensa /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/xtensa/snd-sof-xtensa-dsp.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/intel /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/intel/snd-sof-intel-byt.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/intel/snd-sof-intel-hda-common.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/intel/snd-sof-intel-hda.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/intel/snd-sof-intel-ipc.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/snd-sof-acpi.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/snd-sof.ko.xz /lib/modules/5.5.15-200.fc31.x86_64/kernel/sound/soc/sof/snd-sof-pci.ko.xz
(In reply to Pierre Ossman from comment #11) ... > Also note that reverting to the old driver makes the problem go away, ... Could you clarify what you mean by "reverting to the old driver"? Are you referring to the firmware? If so, please post the version info for the firmware packages that work for you: $ rpm -q alsa-firmware linux-firmware
(In reply to Steve from comment #19) > > Could you clarify what you mean by "reverting to the old driver"? Are you > referring to the firmware? If so, please post the version info for the > firmware packages that work for you: > No. We're reverting to snd_hda_intel by specifying: > options snd-intel-dspcfg dsp_driver=1
(In reply to Steve from comment #18) > Could you please post the output from: > > $ lsmod > Sure: $ lsmod Module Size Used by rfcomm 90112 4 nfsv3 53248 1 nfs_acl 16384 1 nfsv3 nfs 364544 2 nfsv3 lockd 122880 2 nfsv3,nfs grace 16384 1 lockd fscache 397312 1 nfs uinput 20480 0 xt_CHECKSUM 16384 1 xt_MASQUERADE 20480 3 nf_nat_tftp 16384 0 nf_conntrack_tftp 20480 3 nf_nat_tftp tun 57344 1 bridge 208896 0 stp 16384 1 bridge llc 16384 2 bridge,stp nf_conntrack_netbios_ns 16384 1 nf_conntrack_broadcast 16384 1 nf_conntrack_netbios_ns xt_CT 16384 3 ip6t_REJECT 16384 12 nf_reject_ipv6 20480 1 ip6t_REJECT ip6t_rpfilter 16384 1 ipt_REJECT 16384 5 nf_reject_ipv4 16384 1 ipt_REJECT xt_conntrack 16384 44 ebtable_nat 16384 1 ebtable_broute 16384 1 ip6table_nat 16384 1 ip6table_mangle 16384 1 ip6table_raw 16384 1 ip6table_security 16384 1 iptable_nat 16384 1 nf_nat 53248 4 ip6table_nat,nf_nat_tftp,iptable_nat,xt_MASQUERADE iptable_mangle 16384 1 iptable_raw 16384 1 iptable_security 16384 1 nf_conntrack 163840 8 xt_conntrack,nf_nat,nf_conntrack_tftp,nf_conntrack_netbios_ns,nf_nat_tftp,nf_conntrack_broadcast,xt_CT,xt_MASQUERADE nf_defrag_ipv6 24576 1 nf_conntrack nf_defrag_ipv4 16384 1 nf_conntrack libcrc32c 16384 2 nf_conntrack,nf_nat ip_set 57344 0 nfnetlink 16384 1 ip_set ebtable_filter 16384 1 ebtables 40960 3 ebtable_nat,ebtable_filter,ebtable_broute ip6table_filter 16384 1 ip6_tables 36864 7 ip6table_filter,ip6table_raw,ip6table_nat,ip6table_mangle,ip6table_security iptable_filter 16384 1 cmac 16384 9 bnep 28672 2 snd_sof_pci 20480 0 snd_sof_intel_byt 20480 1 snd_sof_pci snd_sof_intel_ipc 20480 1 snd_sof_intel_byt snd_sof_xtensa_dsp 16384 1 snd_sof_pci snd_sof_intel_hda_common 81920 1 snd_sof_pci snd_hda_codec_hdmi 73728 1 snd_soc_hdac_hda 24576 1 snd_sof_intel_hda_common snd_sof_intel_hda 20480 1 snd_sof_intel_hda_common snd_sof 126976 4 snd_sof_pci,snd_sof_intel_hda_common,snd_sof_intel_byt,snd_sof_intel_ipc intel_rapl_msr 20480 0 sunrpc 491520 18 lockd,nfsv3,nfs_acl,nfs intel_rapl_common 32768 1 intel_rapl_msr snd_soc_skl 180224 0 snd_soc_sst_ipc 20480 1 snd_soc_skl snd_hda_codec_realtek 126976 1 snd_soc_sst_dsp 40960 1 snd_soc_skl snd_hda_ext_core 36864 4 snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_soc_skl,snd_sof_intel_hda x86_pkg_temp_thermal 20480 0 snd_soc_acpi_intel_match 32768 3 snd_sof_pci,snd_sof_intel_hda_common,snd_soc_skl intel_powerclamp 20480 0 snd_soc_acpi 16384 3 snd_sof_pci,snd_soc_acpi_intel_match,snd_soc_skl snd_hda_codec_generic 94208 1 snd_hda_codec_realtek coretemp 20480 0 vfat 20480 1 ledtrig_audio 16384 3 snd_hda_codec_generic,snd_hda_codec_realtek,snd_sof snd_soc_core 294912 4 snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_soc_skl fat 86016 1 vfat kvm_intel 311296 0 snd_compress 28672 1 snd_soc_core ac97_bus 16384 1 snd_soc_core snd_pcm_dmaengine 16384 1 snd_soc_core snd_hda_intel 53248 3 snd_intel_dspcfg 28672 4 snd_hda_intel,snd_sof_pci,snd_sof_intel_hda_common,snd_soc_skl btusb 57344 0 snd_hda_codec 163840 5 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec_realtek,snd_soc_hdac_hda btrtl 24576 1 btusb kvm 802816 1 kvm_intel mei_hdcp 24576 0 mei_wdt 16384 0 btbcm 16384 1 btusb iTCO_wdt 16384 0 btintel 32768 1 btusb iTCO_vendor_support 16384 1 iTCO_wdt irqbypass 16384 1 kvm bluetooth 634880 33 btrtl,btintel,btbcm,bnep,btusb,rfcomm snd_hda_core 102400 10 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_ext_core,snd_hda_codec,snd_hda_codec_realtek,snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_soc_skl,snd_sof_intel_hda crct10dif_pclmul 16384 1 snd_hwdep 16384 1 snd_hda_codec crc32_pclmul 16384 0 snd_seq 86016 0 snd_seq_device 16384 1 snd_seq ghash_clmulni_intel 16384 0 snd_pcm 126976 9 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_sof,snd_sof_intel_hda_common,snd_soc_core,snd_soc_skl,snd_hda_core,snd_pcm_dmaengine intel_cstate 20480 0 intel_uncore 147456 0 ecdh_generic 16384 2 bluetooth rtsx_usb_ms 24576 0 rfkill 28672 4 bluetooth intel_rapl_perf 20480 0 snd_timer 40960 2 snd_seq,snd_pcm wmi_bmof 16384 0 intel_wmi_thunderbolt 20480 0 i2c_i801 32768 0 memstick 24576 1 rtsx_usb_ms ecc 36864 1 ecdh_generic snd 102400 18 snd_hda_codec_generic,snd_seq,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek,snd_timer,snd_compress,snd_soc_core,snd_pcm joydev 28672 0 soundcore 16384 1 snd mei_me 45056 2 mei 122880 5 mei_wdt,mei_hdcp,mei_me ie31200_edac 16384 0 acpi_pad 184320 0 ip_tables 32768 5 iptable_filter,iptable_security,iptable_raw,iptable_nat,iptable_mangle rtsx_usb_sdmmc 32768 0 mmc_core 184320 1 rtsx_usb_sdmmc hid_logitech_hidpp 45056 0 hid_microsoft 16384 0 rtsx_usb 28672 2 rtsx_usb_sdmmc,rtsx_usb_ms ff_memless 20480 1 hid_microsoft i915 2453504 5 i2c_algo_bit 16384 1 i915 drm_kms_helper 233472 1 i915 nvme 49152 3 nvme_core 110592 5 nvme drm 585728 6 drm_kms_helper,i915 e1000e 290816 0 crc32c_intel 24576 5 wmi 36864 2 intel_wmi_thunderbolt,wmi_bmof video 53248 1 i915 hid_logitech_dj 28672 0 fuse 139264 1
(In reply to Pierre Ossman from comment #20) ... > No. We're reverting to snd_hda_intel by specifying: > > > options snd-intel-dspcfg dsp_driver=1 Thanks. I had to look that up. Where are you specifying that? And where did you get that option? For the record: $ modinfo snd_intel_dspcfg | egrep 'description|parm' description: Intel DSP config driver parm: dsp_driver:Force the DSP driver for Intel DSP (0=auto, 1=legacy, 2=SST, 3=SOF) (int)
(In reply to Steve from comment #22) > (In reply to Pierre Ossman from comment #20) > ... > > No. We're reverting to snd_hda_intel by specifying: > > > > > options snd-intel-dspcfg dsp_driver=1 > > Thanks. I had to look that up. Where are you specifying that? And where did > you get that option? > We put it in /etc/modprobe.d/snd_intel.conf. Found it by examining snd_hda_intel and figuring out why it wasn't loading. Its modinfo output refers to that new setting: > parm: dmic_detect:Allow DSP driver selection (bypass this driver) (0=off, 1=on) (default=1); deprecated, use snd-intel-dspcfg.dsp_driver option instead (bool)
(In reply to Pierre Ossman from comment #23) > (In reply to Steve from comment #22) > > (In reply to Pierre Ossman from comment #20) > > ... > > > No. We're reverting to snd_hda_intel by specifying: > > > > > > > options snd-intel-dspcfg dsp_driver=1 > > > > Thanks. I had to look that up. Where are you specifying that? And where did you get that option? > > > > We put it in /etc/modprobe.d/snd_intel.conf. Found it by examining snd_hda_intel and > figuring out why it wasn't loading. Its modinfo output refers to that new setting: > > > parm: dmic_detect:Allow DSP driver selection (bypass this driver) (0=off, 1=on) (default=1); deprecated, use snd-intel-dspcfg.dsp_driver option instead (bool) Thanks. Could you post: $ cat /etc/modprobe.d/snd_intel.conf Have you added any other custom files to /etc/modprobe.d/ ?
(In reply to Pierre Ossman from comment #21) ... > Sure: > > $ lsmod > Module Size Used by ... Thanks. That gives us a lot of choices for blacklisting. :-) For blacklist testing, please disable the option in /etc/modprobe.d/snd_intel.conf: # options snd-intel-dspcfg dsp_driver=1 Please try adding this to the kernel command-line from grub2 and seeing if the hang occurs: module_blacklist=snd_sof_pci For the record: $ grep '^snd' lsmod-1.txt | wc -l 30 $ grep '^snd.*0$' lsmod-1.txt snd_sof_pci 20480 0 snd_soc_skl 180224 0 snd_seq 86016 0 $ sort lsmod-1.txt | grep pci snd_intel_dspcfg 28672 4 snd_hda_intel,snd_sof_pci,snd_sof_intel_hda_common,snd_soc_skl snd_soc_acpi 16384 3 snd_sof_pci,snd_soc_acpi_intel_match,snd_soc_skl snd_soc_acpi_intel_match 32768 3 snd_sof_pci,snd_sof_intel_hda_common,snd_soc_skl snd_sof 126976 4 snd_sof_pci,snd_sof_intel_hda_common,snd_sof_intel_byt,snd_sof_intel_ipc snd_sof_intel_byt 20480 1 snd_sof_pci snd_sof_intel_hda_common 81920 1 snd_sof_pci snd_sof_pci 20480 0 snd_sof_xtensa_dsp 16384 1 snd_sof_pci
This is getting out of my lane, but it might help debug the problem. Could you post the output from: $ lspci -nn # That will show "Show PCI vendor and device codes as both numbers and names." I believe the relevant kernel source file is: sound/soc/sof/sof-pci-dev.c This is a commit that changed the names of certain firmware files in that file: ASoC: SOF: Intel: Fix CFL and CML FW nocodec binary names. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/sound/soc/sof/sof-pci-dev.c?h=linux-5.5.y&id=653c28afa76b45c570370c3c3a89975c68c5fc8e But when I look at what alsa-firmware has in /lib/firmware/intel/sof, there is this pair of links: sof-cfl.ri -> sof-cnl.ri sof-cnl.ri -> signed/sof-cnl-signed-intel.ri If I am interpreting the patch correctly, it has no effect whatsoever, because sof-cfl.ri and sof-cnl.ri both go to the same file. $ rpm -q alsa-firmware alsa-firmware-1.2.1-5.fc31.noarch Here is a bigger view of the patch. Specifically, look at the block under CONFIG_SND_SOC_SOF_COFFEELAKE: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/sound/soc/sof/sof-pci-dev.c?h=linux-5.5.y&id=653c28afa76b45c570370c3c3a89975c68c5fc8e&context=8&ignorews=0&dt=1
To add to the confusion, the log doesn't show the actual name of the firmware file that the code is loading, unless something goes wrong. From the attached dmesg.txt: Mar 26 09:15:22 kernel: sof-audio-pci 0000:00:1f.3: Firmware info: version 1:1:0-fcf6c Mar 26 09:15:22 kernel: sof-audio-pci 0000:00:1f.3: Firmware: ABI 3:11:0 Kernel ABI 3:11:0 There is no firmware file name in that! However, an earlier bug report shows the firmware file name: Jan 15 11:16:45 ossman.lkpg.cendio.se kernel: sof-audio-pci 0000:00:1f.3: Direct firmware load for intel/sof/sof-cfl.ri failed with error -2 ^^^^^^^^^^ That log line is from: Bug 1791597 - alsa-firmware missing on upgrade
(In reply to Steve from comment #24) > Thanks. Could you post: > > $ cat /etc/modprobe.d/snd_intel.conf > > Have you added any other custom files to /etc/modprobe.d/ ? $ cat /etc/modprobe.d/snd_intel.conf options snd-intel-dspcfg dsp_driver=1 No other custom files in /etc/modprobe.d/: $ ls /etc/modprobe.d/ blacklist-visor.conf kvm.conf mlx4.conf truescale.conf firewalld-sysctls.conf lockd.conf snd_intel.conf
(In reply to Steve from comment #25) > For blacklist testing, please disable the option in > /etc/modprobe.d/snd_intel.conf: > > # options snd-intel-dspcfg dsp_driver=1 > > Please try adding this to the kernel command-line from grub2 and seeing if > the hang occurs: > > module_blacklist=snd_sof_pci That helped. First I disabled the the workaround in /etc/modprobe.d/snd_intel.conf: $ cat /etc/modprobe.d/snd_intel.conf #options snd-intel-dspcfg dsp_driver=1 And I could verify that the hang still happens with the updated kernel: $ uname -r 5.6.6-200.fc31.x86_64 I killed the power and booted up again. While booting the second time, I edited the kernel command-line to add "module_blacklist=snd_sof_pci" and then the machine booted without problems, no hang.
(In reply to Steve from comment #26) > This is getting out of my lane, but it might help debug the problem. Could > you post the output from: > > $ lspci -nn # That will show "Show PCI vendor and device codes as both > numbers and names." $ lspci -nn 00:00.0 Host bridge [0600]: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers [8086:3ec2] (rev 07) 00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics 630 (Desktop) [8086:3e92] 00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911] 00:14.0 USB controller [0c03]: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d] (rev 10) 00:14.2 RAM memory [0500]: Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f] (rev 10) 00:16.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH HECI Controller [8086:a360] (rev 10) 00:16.3 Serial controller [0700]: Intel Corporation Cannon Lake PCH Active Management Technology - SOL [8086:a363] (rev 10) 00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10) 00:1b.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 [8086:a32c] (rev f0) 00:1f.0 ISA bridge [0601]: Intel Corporation Q370 Chipset LPC/eSPI Controller [8086:a306] (rev 10) 00:1f.3 Audio device [0403]: Intel Corporation Cannon Lake PCH cAVS [8086:a348] (rev 10) 00:1f.4 SMBus [0c05]: Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323] (rev 10) 00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller [8086:a324] (rev 10) 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (7) I219-LM [8086:15bb] (rev 10) 01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
(In reply to Samuel from comment #29) > ... I edited the kernel command-line to add "module_blacklist=snd_sof_pci" and then the machine booted without problems, no hang. Thanks for your careful testing and thorough documentation. That, at least, provides another workaround. > 00:1f.3 Audio device [0403]: Intel Corporation Cannon Lake PCH cAVS [8086:a348] (rev 10) ^^^^ Thanks for posting the lspci output. Despite what everything else says, "a348" is Coffee Lake, which correlates with the name of the firmware file: sof-cfl.ri. (Comment 27) == $ git grep -C1 'a348' sound/soc/sof/sof-pci-dev.c sound/soc/sof/sof-pci-dev.c-#if IS_ENABLED(CONFIG_SND_SOC_SOF_COFFEELAKE) sound/soc/sof/sof-pci-dev.c: { PCI_DEVICE(0x8086, 0xa348), sound/soc/sof/sof-pci-dev.c- .driver_data = (unsigned long)&cfl_desc}, $ git log --oneline -n1 HEAD 6a8b55ed40 (HEAD -> master, tag: v5.7-rc3, origin/master, origin/HEAD) Linux 5.7-rc3
This would probably get more attention under an alsa component, so I suggest: 1. Changing the bug summary to: "snd_sof_pci: boot hang with Coffee Lake [8086:a348] and 5.6.6-200.fc31.x86_64" 2. Changing the component to: "alsa-sof-firmware". For the record: $ grep -B9 'sound/soc/sof' MAINTAINERS SOUND - SOUND OPEN FIRMWARE (SOF) DRIVERS ... L: sound-open-firmware (moderated for non-subscribers) S: Supported W: https://github.com/thesofproject/linux/ F: sound/soc/sof/
Reported to the sof project: https://github.com/thesofproject/linux/issues/2066
(In reply to Samuel from comment #29) > > ... > > > > Please try adding this to the kernel command-line from grub2 and seeing if > > the hang occurs: > > > > module_blacklist=snd_sof_pci > > ... > > I killed the power and booted up again. While booting the second time, I > edited the kernel command-line to add "module_blacklist=snd_sof_pci" and > then the machine booted without problems, no hang. Note that this method causes audio to stop working.
I just looked again at the alsa-firmware updates and can see that the F31 firmware is *older* than the F32 firmware. Further, there is a separate alsa-sof-firmware package: https://bodhi.fedoraproject.org/updates/?packages=alsa-firmware Installing these might be needed: alsa-firmware-1.2.1-7.fc32 and alsa-sof-firmware-1.4.2-4.fc32 https://bodhi.fedoraproject.org/updates/FEDORA-2020-0a7be7437f Jaroslav: Is there any reason why those can't be released for F31?
(In reply to Samuel from comment #29) ... > And I could verify that the hang still happens with the updated kernel: > > $ uname -r > 5.6.6-200.fc31.x86_64 ... Ranjani, upstream, cites commit 8354d9b44530b.* However, that commit is not yet in a kernel stable release: $ git describe 8354d9b44530b fatal: Not a valid object name 8354d9b44530b $ git log --oneline --no-decorate -n1 63c3d4974 Linux 5.6.8 So you would need to test with a "rawhide" kernel (currently F33) from here: https://bodhi.fedoraproject.org/updates/?packages=kernel See, however: Bug 1825940 - kernel-core-5.7.0-0.rc1.20200416git9786cab67457.1 took very long time to install * https://github.com/thesofproject/linux/issues/2066#issuecomment-621309596
It looks like Jaroslav has just completed a build of alsa-sof-firmware for F31: alsa-sof-firmware https://koji.fedoraproject.org/koji/packageinfo?packageID=30869 Information for build alsa-sof-firmware-1.4.2-5.fc31 https://koji.fedoraproject.org/koji/buildinfo?buildID=1502856 Thank-you, Jaroslav. Upstream reference: https://github.com/thesofproject/linux/issues/2066#issuecomment-623534966
alsa-sof-firmware-1.5-1 packages have been built: https://koji.fedoraproject.org/koji/packageinfo?packageID=30869 Jaroslav: Could you push the F31 package: https://bodhi.fedoraproject.org/updates/?packages=alsa-sof-firmware
FEDORA-2020-18cc548263 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-18cc548263
FEDORA-2020-18cc548263 has been pushed to the Fedora 31 testing repository. In short time you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-18cc548263` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-18cc548263 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
I removed my workaround and upgraded to Fedora 32, but that did not solve the issue. However, I then installed alsa-sof-firmware-1.5-1 from updates-testing, and now it works! Thanks for fixing this!
(In reply to Samuel from comment #41) > I removed my workaround and upgraded to Fedora 32, but that did not solve the issue. > > However, I then installed alsa-sof-firmware-1.5-1 from updates-testing, and now it works! Thanks for fixing this! Thanks for doing those before-and-after tests.
alsa-firmware-1.2.1-6.fc31, alsa-sof-firmware-1.5-1.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.