Bug 1514134
Summary: | ACR122 contactless reader fails to work: pn533_usb: transfer buffer not dma capable | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Alex <fedora> | ||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 27 | CC: | airlied, bskeggs, ewk, extras-qa, fedora, gansalmon, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, madhu.chinakonda, mchehab, mjg59, opensource, peter.schwindt, startxgo, steved | ||||
Target Milestone: | --- | Flags: | jforbes:
needinfo?
|
||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | 1436866 | Environment: | |||||
Last Closed: | 2018-08-29 15:17:41 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Alex
2017-11-16 17:22:51 UTC
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. The kernel moves very fast so bugs may get fixed as part of a kernel update. Due to this, we are doing a mass bug update across all of the Fedora 26 kernel bugs. Fedora 26 has now been rebased to 4.15.4-200.fc26. Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel. If you have moved on to Fedora 27, and are still experiencing this issue, please change the version to Fedora 27. If you experience different issues, please open a new bug report for those. This issue affects kernels up to and including 4.15.4-300.fc27.x86_64 Created attachment 1402441 [details]
[PATCH] NFC: pn533: use kmalloc-ed memory for USB transfer buffers
Here is a patch which should fix the error about the buffer not being dma capable, not sure if this will also fix the actual -11 / -EAGAIN error though.
Here is a test-build of the latest Fedora kernel with the patch added: https://koji.fedoraproject.org/koji/taskinfo?taskID=25388155 Note this is still building atm (this takes a couple of hours), for instructions for testing a kernel directly from koji, see: https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt Please let me know if this fixes the warning/backtrace, if it does then I will submit this upstream. Thank you for the Patch Hans. Sadly it does not work: [ 174.319983] usb 1-4: new full-speed USB device number 8 using xhci_hcd [ 174.919606] usb 1-4: New USB device found, idVendor=072f, idProduct=2200 [ 174.919616] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 174.919623] usb 1-4: Product: ACR122U PICC Interface [ 174.919629] usb 1-4: Manufacturer: ACS [ 175.003330] nfc: nfc_init: NFC Core ver 0.1 [ 175.003451] NET: Registered protocol family 39 [ 175.014639] ================================ [ 175.014640] WARNING: inconsistent lock state [ 175.014642] 4.16.0-0.rc3.git2.1.rhbz1514134.fc29.x86_64 #1 Not tainted [ 175.014643] -------------------------------- [ 175.014644] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. [ 175.014646] systemd-logind/1497 [HC1[1]:SC0[0]:HE0:SE1] takes: [ 175.014647] (fs_reclaim){?.+.}, at: [<000000009a6be840>] fs_reclaim_acquire.part.75+0x5/0x30 [ 175.014654] {HARDIRQ-ON-W} state was registered at: [ 175.014655] irq event stamp: 1633158 [ 175.014658] hardirqs last enabled at (1633157): [<000000000a724ee0>] __raw_spin_lock_init+0x1c/0x50 [ 175.014662] hardirqs last disabled at (1633158): [<0000000003684e0b>] interrupt_entry+0xca/0xe0 [ 175.014664] softirqs last enabled at (1632674): [<000000006caec4fc>] __do_softirq+0x38c/0x4f7 [ 175.014666] softirqs last disabled at (1632565): [<00000000bfa17845>] irq_exit+0x101/0x110 [ 175.014667] other info that might help us debug this: [ 175.014669] Possible unsafe locking scenario: [ 175.014670] CPU0 [ 175.014672] ---- [ 175.014673] lock(fs_reclaim); [ 175.014675] <Interrupt> [ 175.014676] lock(fs_reclaim); [ 175.014678] *** DEADLOCK *** [ 175.014681] 1 lock held by systemd-logind/1497: [ 175.014682] #0: (kn->count#183){.+.+}, at: [<0000000093f616c1>] kernfs_fop_open+0x3a/0x410 [ 175.014688] stack backtrace: [ 175.014690] CPU: 2 PID: 1497 Comm: systemd-logind Not tainted 4.16.0-0.rc3.git2.1.rhbz1514134.fc29.x86_64 #1 [ 175.014692] Hardware name: LENOVO 20HGS22D1A/20HGS22D1A, BIOS N1WET41W (1.20 ) 10/17/2017 [ 175.014693] Call Trace: [ 175.014695] <IRQ> [ 175.014699] dump_stack+0x85/0xbf [ 175.014703] print_usage_bug.cold.59+0x1a8/0x1e2 [ 175.014706] print_shortest_lock_dependencies+0x40/0x40 [ 175.014709] ? mark_lock+0x51a/0x620 [ 175.014712] ? check_usage_backwards+0x130/0x130 [ 175.014715] ? __lock_acquire+0x776/0x16d0 [ 175.014718] ? __lock_acquire+0x29a/0x16d0 [ 175.014721] ? native_sched_clock+0x3e/0xa0 [ 175.014725] ? lock_acquire+0x9e/0x1b0 [ 175.014727] ? fs_reclaim_acquire.part.75+0x5/0x30 [ 175.014731] ? fs_reclaim_acquire.part.75+0x29/0x30 [ 175.014733] ? fs_reclaim_acquire.part.75+0x5/0x30 [ 175.014736] ? kmem_cache_alloc_node+0x34/0x330 [ 175.014738] ? __alloc_skb+0x57/0x1e0 [ 175.014740] ? __alloc_skb+0x57/0x1e0 [ 175.014743] ? __usb_hcd_giveback_urb+0x60/0x160 [ 175.014748] ? pn533_recv_response+0x4a/0xa0 [pn533_usb] [ 175.014751] ? __usb_hcd_giveback_urb+0x88/0x160 [ 175.014754] ? xhci_giveback_urb_in_irq.isra.38+0x87/0x1a0 [ 175.014756] ? inc_deq+0x15c/0x1c0 [ 175.014759] ? xhci_td_cleanup+0xfb/0x160 [ 175.014762] ? xhci_irq+0xaa0/0x2110 [ 175.014765] ? __lock_is_held+0x5a/0xa0 [ 175.014769] ? __handle_irq_event_percpu+0x46/0x320 [ 175.014772] ? handle_irq_event_percpu+0x30/0x80 [ 175.014775] ? handle_irq_event+0x34/0x51 [ 175.014778] ? handle_edge_irq+0x7b/0x190 [ 175.014780] ? handle_irq+0xbf/0x100 [ 175.014782] ? do_IRQ+0x53/0x110 [ 175.014785] ? common_interrupt+0xf/0xf [ 175.014786] </IRQ> [ 175.014789] ? kernfs_fop_open+0x1ee/0x410 [ 175.014791] ? kernfs_fop_open+0x1ee/0x410 [ 175.014794] ? lock_is_held_type+0x46/0x60 [ 175.014798] ? ___might_sleep+0x116/0x180 [ 175.014801] ? __mutex_lock+0x56/0xa10 [ 175.014803] ? mark_held_locks+0x57/0x80 [ 175.014805] ? __raw_spin_lock_init+0x1c/0x50 [ 175.014808] ? __lockdep_init_map+0x45/0x170 [ 175.014811] ? kernfs_fop_open+0x1ee/0x410 [ 175.014813] ? kernfs_fop_open+0x1ee/0x410 [ 175.014816] ? kernfs_fop_release+0x90/0x90 [ 175.014819] ? do_dentry_open+0x1c4/0x310 [ 175.014822] ? path_openat+0x512/0xfb0 [ 175.014826] ? do_filp_open+0x93/0x100 [ 175.014830] ? _raw_spin_unlock+0x24/0x30 [ 175.014833] ? do_sys_open+0x18a/0x210 [ 175.014837] ? do_syscall_64+0x79/0x210 [ 175.014840] ? entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 175.014844] BUG: sleeping function called from invalid context at mm/slab.h:420 [ 175.014846] in_atomic(): 1, irqs_disabled(): 1, pid: 1497, name: systemd-logind [ 175.014849] INFO: lockdep is turned off. [ 175.014851] irq event stamp: 1633158 [ 175.014854] hardirqs last enabled at (1633157): [<000000000a724ee0>] __raw_spin_lock_init+0x1c/0x50 [ 175.014857] hardirqs last disabled at (1633158): [<0000000003684e0b>] interrupt_entry+0xca/0xe0 [ 175.014861] softirqs last enabled at (1632674): [<000000006caec4fc>] __do_softirq+0x38c/0x4f7 [ 175.014864] softirqs last disabled at (1632565): [<00000000bfa17845>] irq_exit+0x101/0x110 [ 175.014867] CPU: 2 PID: 1497 Comm: systemd-logind Not tainted 4.16.0-0.rc3.git2.1.rhbz1514134.fc29.x86_64 #1 [ 175.014869] Hardware name: LENOVO 20HGS22D1A/20HGS22D1A, BIOS N1WET41W (1.20 ) 10/17/2017 [ 175.014871] Call Trace: [ 175.014873] <IRQ> [ 175.014876] dump_stack+0x85/0xbf [ 175.014880] ___might_sleep.cold.72+0xac/0xbc [ 175.014884] ? kmem_cache_alloc_node+0x26e/0x330 [ 175.014887] ? __alloc_skb+0x57/0x1e0 [ 175.014890] ? __alloc_skb+0x57/0x1e0 [ 175.014893] ? __usb_hcd_giveback_urb+0x60/0x160 [ 175.014897] ? pn533_recv_response+0x4a/0xa0 [pn533_usb] [ 175.014900] ? __usb_hcd_giveback_urb+0x88/0x160 [ 175.014904] ? xhci_giveback_urb_in_irq.isra.38+0x87/0x1a0 [ 175.014907] ? inc_deq+0x15c/0x1c0 [ 175.014910] ? xhci_td_cleanup+0xfb/0x160 [ 175.014914] ? xhci_irq+0xaa0/0x2110 [ 175.014918] ? __lock_is_held+0x5a/0xa0 [ 175.014923] ? __handle_irq_event_percpu+0x46/0x320 [ 175.014927] ? handle_irq_event_percpu+0x30/0x80 [ 175.014930] ? handle_irq_event+0x34/0x51 [ 175.014934] ? handle_edge_irq+0x7b/0x190 [ 175.014937] ? handle_irq+0xbf/0x100 [ 175.014940] ? do_IRQ+0x53/0x110 [ 175.014943] ? common_interrupt+0xf/0xf [ 175.014945] </IRQ> [ 175.014949] ? kernfs_fop_open+0x1ee/0x410 [ 175.014952] ? kernfs_fop_open+0x1ee/0x410 [ 175.014956] ? lock_is_held_type+0x46/0x60 [ 175.014960] ? ___might_sleep+0x116/0x180 [ 175.014964] ? __mutex_lock+0x56/0xa10 [ 175.014967] ? mark_held_locks+0x57/0x80 [ 175.014970] ? __raw_spin_lock_init+0x1c/0x50 [ 175.014974] ? __lockdep_init_map+0x45/0x170 [ 175.014979] ? kernfs_fop_open+0x1ee/0x410 [ 175.014981] ? kernfs_fop_open+0x1ee/0x410 [ 175.014985] ? kernfs_fop_release+0x90/0x90 [ 175.014988] ? do_dentry_open+0x1c4/0x310 [ 175.014992] ? path_openat+0x512/0xfb0 [ 175.014997] ? do_filp_open+0x93/0x100 [ 175.015004] ? _raw_spin_unlock+0x24/0x30 [ 175.015008] ? do_sys_open+0x18a/0x210 [ 175.015013] ? do_syscall_64+0x79/0x210 [ 175.015017] ? entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 175.017242] usb 1-4: NFC: NXP PN532 firmware ver 1.4 now attached [ 175.025360] usbcore: registered new interface driver pn533_usb Ah, but we are making progress. We are getting much further then before, the driver now causes the kernel to complain about another driver bug then the first one I fixed, but it does register the NFC device. Fixing the second warning / issue is quite doable too, here is a new scratch build with the original fix + a second fix for the new issue: https://koji.fedoraproject.org/koji/taskinfo?taskID=25498501 As before this is still building atm (this takes a couple of hours), for instructions for testing a kernel directly from koji, see: https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt It seems that the backtraces are gone but it also seems to me that the reader is not working anymore. I will do further tests and report back. (In reply to Till Maas from comment #7) > It seems that the backtraces are gone but it also seems to me that the > reader is not working anymore. I will do further tests and report back. Yes, I tested again. The reader works with an older kernel but not with the patched kernel. The backtrace is gone with the newer kernel. The older kernel still shows the backtrace but the reader works nonetheless. Hi, (In reply to Till Maas from comment #8) > (In reply to Till Maas from comment #7) > > It seems that the backtraces are gone but it also seems to me that the > > reader is not working anymore. I will do further tests and report back. > > Yes, I tested again. The reader works with an older kernel but not with the > patched kernel. The backtrace is gone with the newer kernel. The older > kernel still shows the backtrace but the reader works nonetheless. What is the version of the older kernel? Regards, Hans (In reply to Hans de Goede from comment #9) > What is the version of the older kernel? vmlinuz-4.15.7-300.fc27.x86_64 (In reply to Till Maas from comment #10) > vmlinuz-4.15.7-300.fc27.x86_64 Hmm, that is not that old. I can backport my fixes to that kernel and see if there are any unrelated changes in 4.16 which cause things to break there. But before I spend time on that, in your original bug report there were 2 issues really: 1) The "WARNING:" + backtrace 2) These errors, which are unrelated to the warning: [ 62.042804] usb 1-12: NFC: Reader power on cmd error -11 [ 62.042805] pn533_usb 1-12:1.0: NFC: Couldn't poweron the reader (error -11) [ 62.042808] pn533_usb: probe of 1-12:1.0 failed with error -11 With my first test kernel not only 1) got replaced by another "WARNING:" + backtrace, but 2) completely went away, which is somewhat weird. So maybe 2) is an intermittent error which only happens sometime. Can you retry with my second test kernel (which fixes both warnings) and: a) Check if it fails all the time, not just sometime b) Check for errors like those from 2) above in dmesg (In reply to Hans de Goede from comment #11) > (In reply to Till Maas from comment #10) > > vmlinuz-4.15.7-300.fc27.x86_64 > > Hmm, that is not that old. I can backport my fixes to that kernel and see if > there are any unrelated changes in 4.16 which cause things to break there. > > But before I spend time on that, in your original bug report there were 2 > issues really: > > 1) The "WARNING:" + backtrace > > 2) These errors, which are unrelated to the warning: > [ 62.042804] usb 1-12: NFC: Reader power on cmd error -11 > [ 62.042805] pn533_usb 1-12:1.0: NFC: Couldn't poweron the reader (error > -11) > [ 62.042808] pn533_usb: probe of 1-12:1.0 failed with error -11 > > With my first test kernel not only 1) got replaced by another "WARNING:" + > backtrace, but 2) completely went away, which is somewhat weird. According to my old comment, kernel-4.10.15-200.fc25 already allowed me to use the reader again but at least the WARNING/backtrace kept happening. I am not sure about 2). > So maybe 2) is an intermittent error which only happens sometime. Can you > retry with my second test kernel (which fixes both warnings) and: > > a) Check if it fails all the time, not just sometime I tried several times, it failed. The reader's LED did not properly light up, even. > b) Check for errors like those from 2) above in dmesg I will do this later. I have a different datapoint that I could obtain easier, with 4.14.11-200.fc26.x86_64 I get this dmesg output: [ +1.321390] usb 4-3: new full-speed USB device number 15 using ohci-pci [ +1.146027] usb 4-3: New USB device found, idVendor=072f, idProduct=2200 [ +0.000004] usb 4-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ +0.000001] usb 4-3: Product: ACR122U PICC Interface [ +0.000002] usb 4-3: Manufacturer: ACS [ +0.002452] usb 4-3: NFC: Reader power on cmd error -11 [ +0.000007] pn533_usb 4-3:1.0: NFC: Couldn't poweron the reader (error -11) [ +0.000008] pn533_usb: probe of 4-3:1.0 failed with error -11 Currently I have to restart pcscd sometimes but then the reader works. Did not see the backtrace there. I tested again with the second debug kernel. The reader does not work. Here are the details that I found so far: Kernel: 4.16.0-0.rc3.git2.1.rhbz1514134.fc29.x86_64 dmesg: [ +7,784675] usb 1-1: new full-speed USB device number 9 using xhci_hcd [ +0,533594] usb 1-1: New USB device found, idVendor=072f, idProduct=2200 [ +0,000011] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ +0,000008] usb 1-1: Product: ACR122U PICC Interface [ +0,000007] usb 1-1: Manufacturer: ACS [ +0,053183] nfc: nfc_init: NFC Core ver 0.1 [ +0,000137] NET: Registered protocol family 39 [ +0,011706] usb 1-1: NFC: NXP PN532 firmware ver 1.4 now attached [ +0,011307] usbcore: registered new interface driver pn533_usb journalctl -u pcscd: Mär 14 11:04:22 systemd[1]: Started PC/SC Smart Card Daemon. Mär 14 11:04:22 pcscd[4143]: 00000000 ccid_usb.c:649:OpenUSBByName() Can't claim interface 1/9: LIBUSB_ERROR_BUSY Mär 14 11:04:22 pcscd[4143]: 00000188 ifdhandler.c:150:CreateChannelByNameOrChannel() failed Mär 14 11:04:22 pcscd[4143]: 00000025 readerfactory.c:1105:RFInitializeReader() Open Port 0x200000 Failed (usb:072f/2200:libudev:0:/dev/bus/usb/001/009) Mär 14 11:04:22 pcscd[4143]: 00000005 readerfactory.c:376:RFAddReader() ACS ACR122U PICC Interface init failed. Mär 14 11:04:22 pcscd[4143]: 00000083 hotplug_libudev.c:523:HPAddDevice() Failed adding USB device: ACS ACR122U PICC Interface selinux (permissive mode): type=AVC msg=audit(1521022312.109:338): avc: denied { module_request } for pid=4926 comm="systemd-rfkill" kmod=6E65746465762DE3C452F76C55 scontext=system_u:system_r:systemd_rfkill_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system permissive=1 Hi, Ah, so pcscd wants to talk directly to the usb-interface and the problem is that my fixes make the driver actually bind to it as its supposed to, so now the interface is busy. The kernel does allow unbinding drivers and libusb does have an interface for this, so the remaining problem seems to be a pcscd bug. Try the following: ls /sys/bus/usb/drivers/pn533_usb You should see a symlink in there which looks something like "1-1:1.0" but then possibly with different numbers *and* possibly also longer e.g. it might also look like "1-9.4.1:1.2". Now do: sudo sh -c "echo '1-1:1.0' > /sys/bus/usb/drivers/pn533_usb/unbind Replacing '1-1:1.0' with the device-identifier for the reader on your system and after that restart pcscd, that should allow it to bind to the interface. So it seems that by fixing the driver I actually broke things for you as now the driver is binding successfully and thus gets in the way of pcscd. If my theory on this is right, please file a new bug against pcscd for this. Please also file an upstream bug for this and add the following comment on both down and upstream bugs: "Recent versions of libusb have a new: libusb_set_auto_detach_kernel_driver() api which pcscd should call on the libusb_device_handle for the device before trying to claim interfaces, libusb will then automatically detech the kernel driver on claiming the interface (if necessary) and re-attach it on releasing the interface." Regards, Hans awesome, thank you. The new bug is #1555264 *********** MASS BUG UPDATE ************** We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs. Fedora 27 has now been rebased to 4.17.7-100.fc27. Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel. If you have moved on to Fedora 28, and are still experiencing this issue, please change the version to Fedora 28. If you experience different issues, please open a new bug report for those. *********** MASS BUG UPDATE ************** This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 5 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously. |