Bug 919289
Summary: | kernel 3.8.X throws exceptions on a Dell Precision M6500 | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Manuel L. Gonzalez-Garay <mlggaray> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Priority: | unspecified |
Version: | 18 | CC: | allen, gansalmon, itamar, jforbes, jonathan, kernel-maint, madhu.chinakonda, mike |
Hardware: | x86_64 | OS: | Linux |
Doc Type: | Bug Fix | Type: | Bug |
Last Closed: | 2013-04-03 04:22:56 UTC | | |
Description
Manuel L. Gonzalez-Garay
2013-03-08 01:58:37 UTC
The problem is more serious than I thought. Kernel 3.8.1-201.fc18.x86_64 is not usable on my box. The kernel constantly sends messages to /var/log/messages until the root partition is completely full; in my case it created an 82 GB messages file and crashed my system. The message is the same one that I pasted before:

Mar 7 19:35:06 manuelLapTop kernel: [ 37.063036] whci-hcd 0000:0d:00.0-1: wusbhc_rh_suspend (ffff88020bad4000 [ffff88020bad4000]) UNIMPLEMENTED

One thing that I did not mention before is that my disk is encrypted with LUKS. Kernel version 3.7.9-205.fc18.x86_64 #1 SMP does not generate any errors, so I have downgraded my kernel.

Best,
Manuel

I just want to add an update: the latest stable version for my machine is kernel 3.7.9-205.fc18.x86_64. I have tried all the 3.8.X versions of the kernel up to 3.8.3-203, and the same problem happens in all of them. The problem appears as soon as I boot the computer and ask to display the boot prompt: the UNIMPLEMENTED message appears on the screen in a loop. It looks like the bug was introduced in 3.8.1 and continues in all later versions. It may be specific to my architecture, but I am surprised that nobody else has seen this error.

Best,
Manuel

This is interesting: wusbhc_rh_suspend has been UNIMPLEMENTED since the code first went in in 2008, and there do not appear to be any new callers this year. Can you give me the output of lsmod on this system? Also, does adding:

SUSPEND_MODULES="whci_hcd"

to /etc/pm/config.d/unload_modules solve this issue for you in the meantime?

Hello Justin,
Here is my lsmod; I will test the suggestion shortly.
Best,
Manuel

Module Size Used by
rfcomm 68965 4
fuse 78033 3
ebtable_nat 12808 0
bnep 19702 2
bluetooth 319586 10 bnep,rfcomm
vboxpci 23195 0
vboxnetadp 25671 0
vboxnetflt 23480 0
vboxdrv 296323 3 vboxnetadp,vboxnetflt,vboxpci
be2iscsi 93412 0
iscsi_boot_sysfs 15642 1 be2iscsi
bnx2i 54715 0
cnic 66882 1 bnx2i
uio 19045 1 cnic
cxgb4i 32880 0
cxgb4 113494 1 cxgb4i
cxgb3i 32951 0
cxgb3 155602 1 cxgb3i
mdio 13436 1 cxgb3
libcxgbi 56493 2 cxgb3i,cxgb4i
ib_iser 37806 0
rdma_cm 42167 1 ib_iser
ib_addr 13786 1 rdma_cm
iw_cm 18222 1 rdma_cm
ib_cm 41726 1 rdma_cm
ib_sa 32956 2 rdma_cm,ib_cm
ib_mad 46341 2 ib_cm,ib_sa
ib_core 74065 6 rdma_cm,ib_cm,ib_sa,iw_cm,ib_mad,ib_iser
iscsi_tcp 18334 0
libiscsi_tcp 24177 4 cxgb3i,cxgb4i,iscsi_tcp,libcxgbi
libiscsi 50543 8 libiscsi_tcp,bnx2i,cxgb3i,cxgb4i,be2iscsi,iscsi_tcp,ib_iser,libcxgbi
scsi_transport_iscsi 57491 8 bnx2i,be2iscsi,iscsi_tcp,ib_iser,libcxgbi,libiscsi
ipt_MASQUERADE 12881 1
nf_conntrack_netbios_ns 12666 0
nf_conntrack_broadcast 12528 1 nf_conntrack_netbios_ns
ip6table_mangle 12701 1
ip6t_REJECT 12940 2
nf_conntrack_ipv6 18624 27
nf_defrag_ipv6 18206 1 nf_conntrack_ipv6
iptable_nat 13012 1
nf_nat_ipv4 13200 1 iptable_nat
nf_nat 25642 3 ipt_MASQUERADE,nf_nat_ipv4,iptable_nat
iptable_mangle 12696 1
nf_conntrack_ipv4 14809 24
nf_defrag_ipv4 12674 1 nf_conntrack_ipv4
xt_conntrack 12761 50
nf_conntrack 84256 9 nf_conntrack_netbios_ns,ipt_MASQUERADE,nf_nat,nf_nat_ipv4,xt_conntrack,nf_conntrack_broadcast,iptable_nat,nf_conntrack_ipv4,nf_conntrack_ipv6
ebtable_filter 12828 0
ebtables 30758 2 ebtable_nat,ebtable_filter
ip6table_filter 12816 1
ip6_tables 26943 2 ip6table_filter,ip6table_mangle
binfmt_misc 17464 1
uvcvideo 80925 0
videobuf2_vmalloc 12968 1 uvcvideo
videobuf2_memops 13391 1 videobuf2_vmalloc
videobuf2_core 34281 1 uvcvideo
videodev 120901 2 uvcvideo,videobuf2_core
media 20445 2 uvcvideo,videodev
nvidia 11283758 57
snd_hda_codec_idt 70217 1
arc4 12616 2
snd_hda_intel 37938 3
snd_hda_codec 131731 2 snd_hda_codec_idt,snd_hda_intel
iwldvm 241608 0
snd_hwdep 17651 1 snd_hda_codec
mac80211 540054 1 iwldvm
iTCO_wdt 13481 0
i7core_edac 24221 0
snd_seq 64878 0
snd_seq_device 14137 1 snd_seq
snd_pcm 98005 2 snd_hda_codec,snd_hda_intel
i2c_i801 18135 0
iTCO_vendor_support 13420 1 iTCO_wdt
iwlwifi 103188 1 iwldvm
cfg80211 201717 3 iwlwifi,mac80211,iwldvm
whc_rc 13060 0
lpc_ich 17062 0
edac_core 56456 1 i7core_edac
coretemp 13394 0
snd_page_alloc 18269 2 snd_pcm,snd_hda_intel
dell_laptop 17370 0
whci_hcd 32616 0
tg3 149165 0
tifm_7xx1 13372 0
whci 12751 2 whci_hcd,whc_rc
umc 14177 3 whci,whci_hcd,whc_rc
wusbcore 40194 1 whci_hcd
uwb 72916 3 whci_hcd,wusbcore,whc_rc
i2c_core 38354 3 i2c_i801,nvidia,videodev
rfkill 21737 4 cfg80211,bluetooth
snd_timer 28691 2 snd_pcm,snd_seq
snd 79380 14 snd_hwdep,snd_timer,snd_hda_codec_idt,snd_pcm,snd_seq,snd_hda_codec,snd_hda_intel,snd_seq_device
tifm_core 15027 1 tifm_7xx1
mfd_core 13183 1 lpc_ich
soundcore 14492 1 snd
dcdbas 14829 1 dell_laptop
dell_wmi 12682 0
microcode 23449 0
sparse_keymap 13527 1 dell_wmi
vhost_net 33860 0
tun 22939 1 vhost_net
macvtap 18241 1 vhost_net
macvlan 18732 1 macvtap
ecryptfs 99840 0
encrypted_keys 18503 1 ecryptfs
kvm_intel 132720 0
kvm 431794 1 kvm_intel
trusted 21714 1 encrypted_keys
tpm 25830 1 trusted
tpm_bios 18689 1 tpm
uinput 17615 0
dm_crypt 22845 1
sdhci_pci 18661 0
crc32c_intel 12902 0
firewire_ohci 40402 0
sdhci 37836 1 sdhci_pci
firewire_core 62461 1 firewire_ohci
mmc_core 106983 1 sdhci
yenta_socket 41228 0
crc_itu_t 12614 1 firewire_core
wmi 18698 1 dell_wmi
video 18992 0

Does this reproduce without nvidia or virtualbox modules loaded?

Hello Justin,
I just created a new file, unload_modules, containing SUSPEND_MODULES="whci_hcd" in the /etc/pm/config.d/ directory. By the way, the directory was empty; is this normal? This is a clean install. The problem was not resolved; actually, right now the problem appears at startup. I took a picture of the screen in case you need it.
I do not remember how to prevent modules from being loaded; can you please send me the command?

Best,
Manuel

Hello Justin,
I am still clueless about the process of permanently disabling kernel modules. modprobe -r only works temporarily; the modules become active again after reboot. I also added the vbox modules to the unload_modules file, but the modules become active after reboot, and adding the modules to blacklist.conf did not prevent the kernel from loading them after reboot. I searched for information on how to do it, but it looks like the process is different for Fedora 18. I will be happy to try your suggestions, but I will need additional information on how to do it.

Best,
Manuel

For virtualbox, there is probably something in /etc/modprobe.d that needs to be removed. For nvidia it is probably different, as X has to be configured to use the nvidia driver too. I haven't actually installed either of these, but looking at the fusion nvidia how-to:

nvidia-config-display disable
rm /etc/X11/xorg.conf

*** Bug 924545 has been marked as a duplicate of this bug. ***

Josh,
kernel-3.7.9-104.fc17
kernel-3.7.10-101.fc17
Both of these kernels boot without any issues.

kernel-3.8.2-105.fc17
This kernel does not boot and produces a repeating message:

whci-hcd 0000:0d:00.0-1: wusbhc_rh_suspend (ffff88020bad4000 [ffff88020bad4000]) UNIMPLEMENTED

I am working on getting you dmesg, lsmod and lspci output.

Josh,
I can't boot kernel-3.8.2-105.fc17 without:

$ cat /etc/modprobe.d/blacklist.conf | grep whci
# kernel 3.8.X whci_hcd
blacklist whci_hcd

Thank you Allen, I blacklisted whci_hcd and now I can boot kernel 3.8.X.

Best,
Manuel

Created attachment 714841 [details]
kernel-3.8.2-105.fc17.x86_64 dmesg output
Created attachment 714842 [details]
kernel-3.8.2-105.fc17.x86_64 lsmod output
Created attachment 714843 [details]
kernel-3.8.2-105.fc17.x86_64 sudo lspci -nnvv output
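The blacklist workaround that Allen and Manuel describe above can be sketched as a small shell helper. This is only an illustration under stated assumptions: `blacklist_module` is a hypothetical function name, not an existing tool, and the target file is whatever modprobe.d file you choose (the thread uses /etc/modprobe.d/blacklist.conf).

```shell
#!/bin/sh
# Sketch: add a "blacklist <module>" line to a modprobe.d file, only once.
# blacklist_module is a hypothetical helper name, not part of any package.
blacklist_module() {
    module=$1
    conf=$2
    # Do nothing if an identical blacklist line is already present.
    if [ -f "$conf" ] && grep -qxF "blacklist $module" "$conf"; then
        return 0
    fi
    # Otherwise append the entry in the format quoted in the thread.
    printf 'blacklist %s\n' "$module" >> "$conf"
}
```

Run as root, `blacklist_module whci_hcd /etc/modprobe.d/blacklist.conf` would add the same entry quoted in the thread; calling it a second time leaves the file unchanged.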
After updating my Dell Precision M6500 laptop to kernel 3.8.3-103.fc17.x86_64, my /tmp directory filled with this same message:

Mar 25 08:42:42 jackson kernel: [ 21.230188] whci-hcd 0000:0d:00.0-1: wusbhc_rh_suspend (ffff8801142c5800 [ffff8801142c5800]) UNIMPLEMENTED

Rebooting with 3.7.9-104.fc17.x86_64 and deleting a huge /var/log/messages allowed me to work again.

Hello Michael,
I think that we, the owners of the M6500, have the same problem. If you add

blacklist whci_hcd

to /etc/modprobe.d/blacklist.conf, you should be able to boot the new kernels. I hope this is a temporary fix. I do not need whci_hcd, but some other users probably need the module.

Best,
Manuel

Hi Manuel,
In my haste to get back to work, I thought that blacklisting whci_hcd might affect my WiFi, and did not try that solution. However, I have now added the blacklist entry and am running 3.8.3-103.fc17.x86_64 with WiFi and without the bug.

thank you,
mike

Hello Mike,
I also do not have a clear idea about the module; my wireless also works fine after blacklisting it. What I got from Google is "Wireless USB Host Controller Interface (WHCI) driver (EXPERIMENTAL)", but I still do not know what the module is for.

Best,
Manuel

Justin,
The device is a Dell Wireless 420 Ultra Wide Band minicard. It's a little card that connects into a SO-DIMM-style "socket". It's also the Bluetooth card in my laptop. The kernel module that's causing the issue is the wireless USB root hub support.

The behaviour is different between 3.7.X/3.6.X and 3.8.X kernels. I documented seeing this behaviour in previous kernels, but what's different is that in the older kernels the system still booted and you would only see the message a handful of times, whereas in the 3.8.X kernel the system appears not to boot. Whatever is using the value of whc_hc_driver.bus_suspend isn't giving up, where previously it would.
I suspect that if the system had enough time and space for the log messages, it would boot to a login prompt. But even then, I would guess you wouldn't be able to use it. In kernel 3.8.X the log message appears every 100 ms or faster[2].

I think this same (or a similar) issue would exist if you used a USB (wireless root hub) style dongle. Then the hwa_hc.ko module would load, and it also has wusbhc_rh_suspend assigned to its hwahc_hc_driver.bus_suspend struct member[3].

I also think that it's not the kernel modules per se; it's whatever calls or manages them. Maybe it's the USB subsystem? Have the wusbcore kernel modules been maintained?

A Dell Precision M6500 Ubuntu user is also reporting the same issue:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1086961

1) kernel modules that have wusbhc_rh_suspend as an undefined symbol; there are two:

allen@fedora-zero /lib/modules/3.8.2-105.fc17.x86_64 $ sudo find . -name '*.ko' -exec sudo nm --print-file-name '{}' + | grep wusbhc_rh_suspend
./kernel/drivers/usb/host/hwa-hc.ko: U wusbhc_rh_suspend
./kernel/drivers/usb/host/whci/whci-hcd.ko: U wusbhc_rh_suspend
./kernel/drivers/usb/wusbcore/wusbcore.ko:0000000000000099 r __kstrtab_wusbhc_rh_suspend
./kernel/drivers/usb/wusbcore/wusbcore.ko:00000000000001f0 r __ksymtab_wusbhc_rh_suspend
./kernel/drivers/usb/wusbcore/wusbcore.ko:0000000000002b70 T wusbhc_rh_suspend

allen@fedora-zero /lib/modules/3.7.10-101.fc17.x86_64 $ sudo find . -name '*.ko' -exec sudo nm --print-file-name '{}' + | grep wusbhc_rh_suspend
./kernel/drivers/usb/host/hwa-hc.ko: U wusbhc_rh_suspend
./kernel/drivers/usb/host/whci/whci-hcd.ko: U wusbhc_rh_suspend
./kernel/drivers/usb/wusbcore/wusbcore.ko:0000000000000099 r __kstrtab_wusbhc_rh_suspend
./kernel/drivers/usb/wusbcore/wusbcore.ko:00000000000001f0 r __ksymtab_wusbhc_rh_suspend
./kernel/drivers/usb/wusbcore/wusbcore.ko:0000000000002b70 T wusbhc_rh_suspend

2) sudo modprobe whci_hcd;sleep 0.1;sudo rmmod whci_hcd; sudo rmmod wusbcore; produces 0 messages on kernel-3.7.10-101, 96 messages on kernel-3.8.2-105, 64 messages on kernel-3.8.3-103, and 114 messages on kernel-3.8.4-101.

3) kernel 3.8 drivers/usb/host/hwa-hc.c file

static struct hc_driver hwahc_hc_driver = {
    .description = "hwa-hcd",
    .product_desc = "Wireless USB HWA host controller",
    .hcd_priv_size = sizeof(struct hwahc) - sizeof(struct usb_hcd),
    .irq = NULL, /* FIXME */
    .flags = HCD_USB2, /* FIXME */
    .reset = hwahc_op_reset,
    .start = hwahc_op_start,
    .stop = hwahc_op_stop,
    .get_frame_number = hwahc_op_get_frame_number,
    .urb_enqueue = hwahc_op_urb_enqueue,
    .urb_dequeue = hwahc_op_urb_dequeue,
    .endpoint_disable = hwahc_op_endpoint_disable,

    .hub_status_data = wusbhc_rh_status_data,
    .hub_control = wusbhc_rh_control,
    .bus_suspend = wusbhc_rh_suspend,
    .bus_resume = wusbhc_rh_resume,
    .start_port_reset = wusbhc_rh_start_port_reset,
};

Created attachment 716131 [details]
sudo modprobe whci_hcd; sleep-0.1; sudo rmmod whci_hcd; sudo rmmod wusbcore; output for kernel 3.7.10-101
Created attachment 716132 [details]
sudo modprobe whci_hcd; sleep-0.1; sudo rmmod-whci_hcd; sudo rmmod wusbcore; output for kernel 3.8.2-105
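The per-kernel message counts from the load/unload test above could be reproduced with something like the following sketch. It assumes the kernel log was saved to a file (for example with `dmesg > after.txt`) after running the modprobe/rmmod sequence; `count_unimplemented` is a hypothetical helper name, and the pattern simply matches the message text quoted in this report.

```shell
#!/bin/sh
# Sketch: count wusbhc_rh_suspend UNIMPLEMENTED lines in a saved kernel log.
# count_unimplemented is a hypothetical helper name.
count_unimplemented() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so the exit status is deliberately ignored here.
    grep -c 'wusbhc_rh_suspend.*UNIMPLEMENTED' "$1" || true
}
```

Usage would look like `count_unimplemented after.txt`. Clearing or timestamping the log before the test would be needed to separate the new messages from those emitted at boot.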
Even though this bug is filed against Fedora 18, the issue will happen on any 3.8.X kernel running on a Dell Precision M6500. I am still on Fedora 17.

We've disabled the driver in both F18 and F17. This will be included in the next submitted update of each.

Josh,
I don't think I have ever seen a bluetooth/UWB USB root hub or any gadget that acted like one. It looks like they are orphaned:

CERTIFIED WIRELESS USB (WUSB) SUBSYSTEM:
L: linux-usb.org
S: Orphan
F: Documentation/usb/WUSB-Design-overview.txt
F: Documentation/usb/wusb-cbaf
F: drivers/usb/host/hwa-hc.c
F: drivers/usb/host/whci/
F: drivers/usb/wusbcore/
F: include/linux/usb/wusb*

Are these modules supposed to work? Did you disable just USB_WHCI_HCD? Or did you disable all of wusbcore (USB_WUSB/USB_WUSB_CBAF/UWB_WHCI, etc.)?

kernel-3.8.5-201.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.8.5-201.fc18

Package kernel-3.8.5-201.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.8.5-201.fc18'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-4645/kernel-3.8.5-201.fc18
then log in and leave karma (feedback).

kernel-3.8.5-201.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.
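Allen's question above about which options were disabled can be checked on an installed kernel by reading its build configuration. The sketch below assumes the config is available as a plain-text file (Fedora normally installs one as /boot/config-<version>); `config_state` is a hypothetical helper name, and the option names are the ones mentioned in the thread.

```shell
#!/bin/sh
# Sketch: report how a kernel config file sets one option.
# config_state is a hypothetical helper name.
config_state() {
    option=$1
    config=$2
    if grep -q "^CONFIG_${option}=" "$config"; then
        # Enabled options appear as CONFIG_<NAME>=y or CONFIG_<NAME>=m.
        grep "^CONFIG_${option}=" "$config"
    else
        # Disabled options are absent or commented out in the config file.
        echo "CONFIG_${option} is not set"
    fi
}
```

For example, `for o in USB_WHCI_HCD USB_WUSB USB_WUSB_CBAF UWB_WHCI; do config_state "$o" "/boot/config-$(uname -r)"; done` would list the state of each option for the running kernel.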