Bug 1119361
| Summary: | r592 IRQ: DMA error | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Neal Becker <ndbecker2> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 20 | CC: | gansalmon, gordon, itamar, jonathan, kernel-maint, madhu.chinakonda, maximlevitsky, mchehab, nemesis |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-04-28 18:25:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Neal Becker
2014-07-14 15:28:56 UTC
I am seeing this bug on OpenSUSE with kernel-desktop versions 3.15.6 and 3.16.0. This bug does not appear to be present in the OpenSUSE stock 3.11.6 and 3.11.10 kernels. The log files on my laptop, /var/log/messages and /var/log/warn, will grow HUGE with the following repeated over and over along with some extraneous related log entries:
2014-08-15T12:18:41.396981-04:00 localhost kernel: [17310.691176] r592: IRQ: card added
2014-08-15T12:18:41.396983-04:00 localhost kernel: [17310.691178] r592: IRQ: DMA error
.
<snip>
.
2014-08-15T12:18:41.397001-04:00 localhost kernel: [17311.223846] r592: IRQ: card added
2014-08-15T12:18:41.397004-04:00 localhost kernel: [17311.223847] r592: IRQ: DMA error
On any given day, there will be tens of thousands of these entries such that both /var/log/messages and /var/log/warn will exceed 3GB within a day.
I believe that this bug is generated due to the presence of a Ricoh R5C592 Memory Stick Adapter on my laptop, which has never been properly recognized by the stock Linux kernels. Apparently, the kernel developers keep trying to make it work but have now introduced a "bigger" problem so to speak since it is now causing all these extraneous error messages to be generated in the logs.
I have since reverted back to the earlier kernels that don't have this problem.
FYI,
Gordon
Very sorry I haven't noticed this :-(
Both drivers (r592 and r852) landed long ago, much before 3.12.
I am running here 3.12-rc1 so my first step would be to see if I can reproduce this.
Sorry, I was swamped with studies, but it's over now, so I will handle this properly.
could you post your lspci?
I assume you didn't try an xD or Memstick card?
could you post your /proc/interrupts?
Best regards,
Maxim Levitsky
lspci
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:01.0 PCI bridge: Intel Corporation Mobile PM965/GM965/GL960 PCI Express Root Port (rev 0c)
00:1a.0 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
00:1a.7 USB controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 03)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 03)
00:1d.0 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
00:1f.0 ISA bridge: Intel Corporation 82801HM (ICH8M) LPC Interface Controller (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) SATA Controller [AHCI mode] (rev 03)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 03)
01:00.0 VGA compatible controller: NVIDIA Corporation G86M [GeForce 8600M GS] (rev a1)
02:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 01)
09:09.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 05)
09:09.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22)
09:09.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12)
09:09.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
cat /proc/interrupts
CPU0 CPU1
0: 46644053 0 IO-APIC-edge timer
1: 101308 0 IO-APIC-edge i8042
8: 1 0 IO-APIC-edge rtc0
9: 33800 0 IO-APIC-fasteoi acpi
12: 18030 0 IO-APIC-edge i8042
14: 206806 0 IO-APIC-edge ata_piix
15: 0 0 IO-APIC-edge ata_piix
16: 0 0 IO-APIC-fasteoi uhci_hcd:usb3
18: 784 0 IO-APIC-fasteoi ehci_hcd:usb1, uhci_hcd:usb7
19: 0 0 IO-APIC-fasteoi uhci_hcd:usb6
20: 14 0 IO-APIC-fasteoi firewire_ohci
21: 815080 0 IO-APIC-fasteoi uhci_hcd:usb4, r592, mmc0
23: 14 0 IO-APIC-fasteoi ehci_hcd:usb2, uhci_hcd:usb5
44: 7114760 933804 PCI-MSI-edge ahci
45: 64570 33985 PCI-MSI-edge nouveau
46: 3923781 1274141 PCI-MSI-edge p3p1
47: 2215894 13051590 PCI-MSI-edge iwl4965
48: 350 0 PCI-MSI-edge snd_hda_intel
NMI: 3885 21611 Non-maskable interrupts
LOC: 25062896 48054361 Local timer interrupts
SPU: 0 0 Spurious interrupts
PMI: 3885 21611 Performance monitoring interrupts
IWI: 1483237 1342822 IRQ work interrupts
RTR: 0 0 APIC ICR read retries
RES: 5794165 6753835 Rescheduling interrupts
CAL: 16384 24149 Function call interrupts
TLB: 1069664 964191 TLB shootdowns
TRM: 0 0 Thermal event interrupts
THR: 0 0 Threshold APIC interrupts
MCE: 0 0 Machine check exceptions
MCP: 712 709 Machine check polls
THR: 0 0 Hypervisor callback interrupts
ERR: 0
MIS: 0
I did plug an sd card in the other day, AFAICT it worked fine.
Problem is still there:
ls -l /var/log/messages
-rw-r--r-- 1 root root 125589321 Aug 23 08:01 /var/log/messages
[nbecker@nbecker1 ~]$ tail /var/log/messages
Aug 23 08:01:47 nbecker1 kernel: r592: IRQ: card added
Aug 23 08:01:47 nbecker1 kernel: r592: IRQ: DMA error
Aug 23 08:01:47 nbecker1 kernel: [211701.788999] r592: IRQ: card added
Aug 23 08:01:47 nbecker1 kernel: [211701.789005] r592: IRQ: DMA error
Aug 23 08:01:47 nbecker1 kernel: r592: IRQ: card added
Aug 23 08:01:47 nbecker1 kernel: r592: IRQ: DMA error
Aug 23 08:01:47 nbecker1 kernel: [211701.829044] r592: IRQ: card added
Aug 23 08:01:47 nbecker1 kernel: [211701.829055] r592: IRQ: DMA error
Aug 23 08:01:47 nbecker1 kernel: r592: IRQ: card added
Aug 23 08:01:47 nbecker1 kernel: r592: IRQ: DMA error
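To put a number on how much of the log is taken up by these messages, the lines can be tallied per event. The following is a minimal sketch, assuming the syslog line layout shown in the tail output above (with or without the bracketed kernel timestamp); the function name count_r592_events is made up for illustration and is not part of any tool mentioned in this report.

```python
import re
from collections import Counter

# Matches kernel log lines such as
#   "Aug 23 08:01:47 nbecker1 kernel: [211701.788999] r592: IRQ: card added"
# The bracketed timestamp is optional, since both forms appear in the log.
R592_LINE = re.compile(r"kernel: (?:\[\s*\d+\.\d+\] )?r592: IRQ: (?P<event>.+)$")


def count_r592_events(lines):
    """Return a Counter mapping each r592 IRQ event text to its count.

    `lines` is any iterable of log lines (a list or an open file).
    Hypothetical helper, written only for this illustration.
    """
    counts = Counter()
    for line in lines:
        match = R592_LINE.search(line.rstrip("\n"))
        if match:
            counts[match.group("event")] += 1
    return counts
```

Passing an open file works as well, for example count_r592_events(open("/var/log/messages", errors="replace")), though on a multi-gigabyte log this will take a while.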
OK, could you now find .config of your kernel and do
cat /boot/config-3.17.0-rc1+ | grep MMC_RICOH_MMC
# CONFIG_MMC_RICOH_MMC is not set
does it also say that this setting is not set?
(replace the /boot/config-3.17.0-rc1+ with your kernel config location)
I am pretty sure that this is the problem and it explains everything; I will explain later.
I probably should send a patch to remove this option altogether.
Best regards,
Maxim Levitsky
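The grep above matches both the "=y" form and the "is not set" comment form, so the two have to be told apart by eye. A minimal sketch of the same check that reports the state explicitly, assuming the usual kernel .config line formats; kconfig_state is a hypothetical helper name, not an existing tool:

```python
import re


def kconfig_state(config_text, option):
    """Report how `option` appears in a kernel .config.

    Returns the assigned value (e.g. "y" or "m"), the string "not set"
    for a "# CONFIG_FOO is not set" line, or None if the option does not
    appear at all. Hypothetical helper for illustration only.
    """
    set_re = re.compile(rf"^{re.escape(option)}=(.*)$")
    unset_re = re.compile(rf"^# {re.escape(option)} is not set$")
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if unset_re.match(line):
            return "not set"
    return None
```

For example, kconfig_state(open("/boot/config-" + release).read(), "CONFIG_MMC_RICOH_MMC"), where release is the running kernel's version string as printed by uname -r.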
uname -a
Linux nbecker1 3.15.10-200.fc20.x86_64 #1 SMP Thu Aug 14 15:39:24 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
/boot/config-3.15.10-200.fc20.x86_64 contains:
CONFIG_MMC_RICOH_MMC=y
Could you also post output of
sudo lspci -H1
After the error happened of course.
Best regards,
Maxim Levitsky
AFAICT this happens continuously as long as the machine is on
sudo lspci -H1
[sudo] password for nbecker:
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:01.0 PCI bridge: Intel Corporation Mobile PM965/GM965/GL960 PCI Express Root Port (rev 0c)
00:1a.0 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
00:1a.7 USB controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 03)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 03)
00:1d.0 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
00:1f.0 ISA bridge: Intel Corporation 82801HM (ICH8M) LPC Interface Controller (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) SATA Controller [AHCI mode] (rev 03)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 03)
01:00.0 VGA compatible controller: NVIDIA Corporation G86M [GeForce 8600M GS] (rev a1)
02:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 01)
09:09.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 05)
09:09.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22)
09:09.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12)
09:09.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
I am pretty sure I know what is going on here.
The r592 IRQ is shared here with USB, so its interrupt handler is flooded with interrupts that don't belong to it.
It asks the device for its interrupt status, but somehow the device reports that it needs an interrupt serviced.
I suspect that the device is somehow disabled by the BIOS but still visible on the PCI bus.
Another thing to note, which I suspected, is CONFIG_MMC_RICOH_MMC, which seems not to be the cause of the problem here, but it can cause it.
The problem is that this card reader is usually represented by 5 PCI functions, of which the 0th is the FireWire, the 1st is the SD card reader, the 2nd is MMC, the 3rd is Memstick, and lastly the 4th is the xD, and it works fine in this configuration after much effort by me and others.
This setting (CONFIG_MMC_RICOH_MMC) disables the MMC function (and lets the SD function take MMC cards).
This has the nasty effect that now, as in your case, the Memstick and xD functions are numbers 2 and 3.
And in some cases Linux thinks so but the hardware doesn't; that's why I asked for lspci -H1, as this asks PCI directly what is there.
So in your case this 'disabler' does work, so it's not the cause of the problem, but I would still recommend trying to disable it (you need to recompile the kernel and do a _cold_ boot into the new kernel).
Another thing to try is to bring a real Memstick card and try it; it would be interesting to see if the error goes away when it's inserted, that is, maybe the BIOS will re-enable the reader then.
Best regards,
Maxim Levitsky
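The IRQ sharing described above can be read off /proc/interrupts directly. The following is a rough sketch that lists which handlers sit on the same line as a given driver. It assumes the 3.x-era column layout pasted earlier in this report (per-CPU counts, one chip/type column such as IO-APIC-fasteoi, then a comma-separated handler list); newer kernels print extra columns and would need adjusting. The function name is invented for the example.

```python
def handlers_sharing_irq(interrupts_text, driver):
    """Map IRQ number -> handler names for each line where `driver` is registered.

    Hypothetical helper. Assumes the /proc/interrupts layout:
        "<irq>: <count per CPU>... <chip-type> <handler>, <handler>, ..."
    """
    shared = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip the CPU header and summary rows such as NMI/LOC/ERR
        fields = rest.split()
        # Step over the per-CPU counters; the next field is the chip/type column.
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        names = " ".join(fields[i + 1:])
        handlers = [name.strip() for name in names.split(",") if name.strip()]
        if driver in handlers:
            shared[label.strip()] = handlers
    return shared
```

With the /proc/interrupts output posted in this report, calling it with "r592" would return the IRQ 21 line, where r592 shares the interrupt with uhci_hcd:usb4 and mmc0.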
Any update?
I haven't tried building a new kernel yet. Trying it now. I haven't built a kernel in a long time.
I grabbed the rpm source
yumdownloader --source kernel
Then in kernel.spec I did
%define buildid .local
I tried editing config-local, adding
CONFIG_MMC_RICOH_MMC=n
but then rpmbuild fails.
Without this, rpmbuild -bp kernel.spec will succeed, but doesn't do much:
...
# configuration written to .config
#
+ echo '# x86_64'
+ cat .config
+ find . '(' -name '*.orig' -o -name '*~' ')' -exec rm -f '{}' ';'
+ find . -name .gitignore -exec rm -f '{}' ';'
+ cd ..
+ exit 0
Compilation finished at Sat Sep 6 13:58:33
All it seems to do is some config, but not compile anything.
What do I need to do?
I built/installed a kernel
Linux nbecker1 3.16.2-200.local.fc20.x86_64 #1 SMP Tue Sep 16 13:12:11 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
with
# CONFIG_MMC_RICOH_MMC is not set
This does NOT fix the problem.
The system is essentially unusable unless I kill rsyslogd.
I may have gotten to the bottom of this. I cleaned out all the old journal files and rebooted. Now I don't have rsyslogd using 100% cpu. The issue with
Jul 14 07:11:00 nbecker1 kernel: r592: IRQ: DMA error
Jul 14 07:11:00 nbecker1 kernel: r592: IRQ: card added
is probably gone. I'm guessing that the reason I kept getting rsyslogd running 100% cpu is processing the OLD journal files. I tried rebooting into the stock 3.16.2-200.fc20.x86_64 as well as my .local version, and I don't see the problem.
Please note that the change made by the CONFIG_MMC_RICOH_MMC setting survives reboots.
You need a full shutdown (on a laptop, preferably with the battery removed) to be sure that it was reset.
I would be glad if you tested the new and old kernels this way.
Also, the issue has a good chance of appearing after hibernate/suspend, so if you could, please test that too.
Thanks a lot!!
Best regards,
Maxim Levitsky
*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.
Fedora 20 has now been rebased to 3.17.2-200.fc20. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.
If you experience different issues, please open a new bug report for those.
Funny you should mention, just this morning I updated to 3.17.2-200.fc20 and the problem is _still there_.
I did rebuild from srpm, using:
CONFIG_MMC_RICOH_MMC=n
in config-local, and the problem appears to be gone. So I believe CONFIG_MMC_RICOH_MMC=n is needed.
After updating f20->f21, this problem seemed to go away. Now on kernel-3.17.7-300.fc21.x86_64, it's back with a vengeance:
Jan 6 14:39:57 nbecker1 kernel: [57260.180986] r592: IRQ: card added
Jan 6 14:39:57 nbecker1 kernel: [57260.180998] r592: IRQ: DMA error
Jan 6 14:39:57 nbecker1 kernel: r592: IRQ: card added
Jan 6 14:39:57 nbecker1 kernel: r592: IRQ: DMA error
Jan 6 14:39:57 nbecker1 kernel: [57260.188919] r592: IRQ: card added
Jan 6 14:39:57 nbecker1 kernel: [57260.188927] r592: IRQ: DMA error
Jan 6 14:39:57 nbecker1 kernel: r592: IRQ: card added
(why isn't logging rate limiting???)
Guess back to custom kernel
*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.
Fedora 20 has now been rebased to 3.18.7-100.fc20. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.
If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.
Still encountering this in kernel 4.4 when using the SD card. I can do some tests if necessary.
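On the rate-limiting question raised in this thread: the kernel does provide helpers for this, such as printk_ratelimited(), which allow a burst of messages per time interval and drop the rest. Whether the r592 driver uses them is not shown anywhere in this report. The following is only a user-space sketch of the interval/burst idea, not kernel code; the class name and the default values are assumptions chosen for the example.

```python
class IntervalRateLimiter:
    """Allow at most `burst` events per `interval` seconds.

    Illustrative sketch only: a fixed-window limiter written for this
    example, not the kernel's implementation.
    """

    def __init__(self, interval=5.0, burst=10):
        self.interval = interval
        self.burst = burst
        self.window_start = None  # time at which the current window opened
        self.emitted = 0          # events allowed in the current window
        self.suppressed = 0       # total events dropped so far

    def allow(self, now):
        """Return True if an event at time `now` (in seconds) may be logged."""
        # Open a new window when none exists or the current one has expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.emitted = 0
        if self.emitted < self.burst:
            self.emitted += 1
            return True
        self.suppressed += 1
        return False
```

With a limiter like this in front of the two r592 messages, a flood of tens of thousands of identical lines per day would shrink to at most `burst` lines per interval, plus a count of how many were suppressed.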