Red Hat Bugzilla – Bug 504402
lirc_serial is crashing kernel
Last modified: 2009-07-22 17:58:35 EDT
Description of problem:
When using a remote with lirc_serial, the kernel crashes. This happens if you run irw twice. The first time it works and the remote buttons are displayed, but after restarting irw the kernel crashes (a call trace appears on a tty) and the machine freezes. There is no way to gather information about the crash once the machine is frozen (dmesg and /var/log/messages get no new entries when logged in via ssh).
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. modprobe lirc_serial
3. press a button on the remote, it works
4. stop irw and restart it again
5. press a button on the remote, kernel crashes
crash and sometimes freeze
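The steps above appear to boil down to opening the lirc character device a second time after it has been closed (irw talks to lircd, and lircd is the process that opens the device). A minimal sketch of that open/close/open sequence without lircd in between follows. The helper name open_twice is hypothetical, the device node is an assumption supplied by the caller, and lircd would have to be stopped first so the device is free. On an affected kernel the second open is expected to trigger the crash, so only try it on a machine that can be power cycled.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open and close a character device twice in a row.
 * Returns 0 if both opens succeed, -1 if either fails.
 * The path is an assumption: pass whichever /dev/lircN node
 * lirc_serial registered on the system under test.
 */
int open_twice(const char *path)
{
    assert(path != NULL);

    for (int attempt = 1; attempt <= 2; attempt++) {
        int fd = open(path, O_RDONLY | O_NONBLOCK);
        if (fd < 0) {
            perror(path);
            return -1;
        }
        printf("open #%d of %s succeeded\n", attempt, path);
        close(fd);
    }
    return 0;
}
```

Calling open_twice() with the lirc device node from a small main() mirrors steps 2 to 5 without a remote; with a harmless node such as /dev/null it simply returns 0.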
Additional info: I'm loading lirc_serial this way via /etc/modprobe.d/lirc.conf:
alias char-major-61 lirc_serial
options lirc_serial irq=4 io=0x3f8
install lirc_serial /bin/setserial /dev/ttyS0 uart none;\
/sbin/modprobe --ignore-install lirc_serial
This kerneloops may be related to this issue. It is the same crash, but the kernel was tainted by the nvidia driver. For this bug report I removed the nvidia driver entirely, but now the machine freezes hard and I couldn't get more information (with the nvidia driver only the module crashed and the machine stayed responsive): http://www.kerneloops.org/submitresult.php?number=415193
Created attachment 346773
oops report from kerneloops.org
This bug is still present in my Fedora 11 installation as well. I'm using lirc to control my set-top box with an IR blaster in a mythtv installation. The channel change script succeeds exactly once per reboot. Any attempt to change the channel a second time with lirc results in a hard lock-up that requires a power cycle.
No error messages ever make it to any of the logs, so I'm unable to provide much detail at this point. I'm including my /etc/modprobe.d/lirc.conf file, but it's basically identical to the one above.
alias char-major-61-0 lirc_i2c
alias char-major-61-1 lirc_serial
options lirc_serial irq=4 io=0x3f8 softcarrier=1
install lirc_i2c /sbin/modprobe ivtv; /sbin/modprobe --ignore-install lirc_i2c
install lirc_serial setserial /dev/ttyS0 uart none; /sbin/modprobe --ignore-install lirc_serial
All I can hope for is to get a digital photo of the screen dump the next time I see it. I'm not sure why, but error messages reach the screen less consistently than the machine locks up.
Created attachment 348043
BUG: unable to handle kernel NULL pointer dereference at (null)
After a few attempts, my system survived long enough to send some error messages to the logs. I'm attaching them in case they're useful even though they look very similar to Sebastian's attachment.
static void lirc_buffer_clear(struct lirc_buffer *buf)
static inline void kfifo_reset(struct kfifo *fifo)
unsigned long flags;
It looks like ir->buf->fifo->lock is NULL.
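To illustrate the diagnosis: in kernels of this era, struct kfifo stores a pointer to a spinlock, and kfifo_reset() takes that lock before resetting the indices, so a NULL lock pointer is dereferenced inside the spinlock code. The following is a userspace sketch only. The fake_lock and fake_kfifo types and the fake_kfifo_reset() helper are hypothetical stand-ins, and the NULL check shows one way to make the failure visible; it is not necessarily how the bug was fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel spinlock. */
struct fake_lock {
    int locked;
};

/* Hypothetical stand-in for struct kfifo: the lock is held by pointer. */
struct fake_kfifo {
    unsigned int in;
    unsigned int out;
    struct fake_lock *lock;
};

/*
 * Same shape as kfifo_reset(): take the lock, reset the indices,
 * release the lock. Unlike the kernel helper, this version refuses
 * to touch a NULL lock and returns -1 instead of crashing.
 */
int fake_kfifo_reset(struct fake_kfifo *fifo)
{
    if (fifo == NULL || fifo->lock == NULL)
        return -1;              /* the kernel would oops at this point */

    fifo->lock->locked = 1;     /* stands in for spin_lock_irqsave() */
    fifo->in = 0;
    fifo->out = 0;
    fifo->lock->locked = 0;     /* stands in for spin_unlock_irqrestore() */
    return 0;
}
```

With a valid lock the indices are reset and 0 is returned; with lock set to NULL the function returns -1 at the point where the real code would fault.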
Same problem with my (non-tainted) F11 - I'm loading lirc_serial and lirc_i2c in the same way.
Last section of dmesg:
e100: eth0 NIC Link is Up 100 Mbps Full Duplex
ADDRCONF(NETDEV_UP): eth0: link is not ready
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
lirc_dev: IR Remote Control driver registered, major 248
lirc_i2c: chip 0x10020 found @ 0x18 (Hauppauge IR)
lirc_dev: lirc_register_driver: sample_rate: 10
lirc_serial: auto-detected active high receiver
lirc_dev: lirc_register_driver: sample_rate: 0
eth0: no IPv6 routers present
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c0421b04>] __ticket_spin_lock+0x8/0x19
*pdpt = 0000000024c0f001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/virtual/lirc/lirc1/dev
Modules linked in: lirc_serial lirc_i2c lirc_dev sco bridge stp llc bnep l2cap bluetooth sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 jfs dm_multipath uinput ivtvfb tuner_simple tuner_types tda9887 tda8290 ppdev tuner msp3400 saa7127 saa7115 dcdbas ivtv cx2341x iTCO_wdt snd_intel8x0 serio_raw v4l2_common iTCO_vendor_support pcspkr snd_ac97_codec ac97_bus i2c_i801 e100 videodev mii v4l1_compat tveeprom snd_usb_audio snd_pcm snd_timer snd_page_alloc snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd parport_pc soundcore parport joydev ata_generic pata_acpi nouveau drm i2c_algo_bit i2c_core [last unloaded: p4_clockmod]
Pid: 2088, comm: lircd Not tainted (2.6.29.4-167.fc11.i686.PAE #1) Dimension 8300
EIP: 0060:[<c0421b04>] EFLAGS: 00010046 CPU: 0
EIP is at __ticket_spin_lock+0x8/0x19
EAX: 00000000 EBX: 00000286 ECX: c07f8db3 EDX: 00000100
ESI: 00000000 EDI: 00000000 EBP: dfc0fe5c ESP: dfc0fe5c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process lircd (pid: 2088, ti=dfc0e000 task=e4edb280 task.ti=dfc0e000)
dfc0fe64 c0421bc7 dfc0fe74 c0716d5d e4d77b00 e4dcf400 dfc0fe84 f2d7d6c2
00000000 e4d77b88 dfc0fea4 c04ab191 dfc09000 e4f1eae0 e4f1eae0 dfc09000
e4f1eae0 edc0d480 dfc0fec0 c04a73e3 e4481000 00000000 e4481000 dfc09000
[<c0421bc7>] ? default_spin_lock_flags+0x8/0xd
[<c0716d5d>] ? _spin_lock_irqsave+0x30/0x37
[<f2d7d6c2>] ? lirc_dev_fop_open+0xb4/0x189 [lirc_dev]
[<c04ab191>] ? chrdev_open+0x11e/0x135
[<c04a73e3>] ? __dentry_open+0x116/0x1f9
[<c04a756e>] ? nameidata_to_filp+0x32/0x47
[<c04ab073>] ? chrdev_open+0x0/0x135
[<c04b1546>] ? do_filp_open+0x34c/0x5e9
[<c04ab802>] ? cp_new_stat64+0xe3/0xf5
[<c05687b1>] ? strncpy_from_user+0x38/0x54
[<c04b94c3>] ? alloc_fd+0xd0/0xdc
[<c04a71ea>] ? do_sys_open+0x47/0xbc
[<c04a72ab>] ? sys_open+0x23/0x2b
[<c040955e>] ? syscall_call+0x7/0xb
Code: 4f fd ff ff 5b eb 13 56 0f b7 d2 ff 75 08 89 d9 0f b6 c0 e8 6e fd ff ff 5a 59 8d 65 f8 5b 5e 5d c3 90 90 55 ba 00 01 00 00 89 e5 <3e> 66 0f c1 10 38 f2 74 06 f3 90 8a 10 eb f6 5d c3 55 89 c2 89
EIP: [<c0421b04>] __ticket_spin_lock+0x8/0x19 SS:ESP 0068:dfc0fe5c
---[ end trace c012546f4855adea ]---
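For reading the trace above: the number after "Oops:" is the x86 page-fault error code, a small bit field. The sketch below decodes its three lowest bits. The function name describe_oops_code and the output wording are hypothetical; the bit meanings are the architectural ones. Applied to the 0002 in this trace, it describes a kernel-mode write to a page that is not present, which fits a spinlock operation on a NULL pointer (the faulting address is reported as (null), and EAX is 00000000).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* x86 page-fault error code bits. */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access */
#define PF_USER  0x4u  /* 0: kernel mode,      1: user mode */

/* Write a human-readable description of an oops error code into out. */
void describe_oops_code(unsigned int code, char *out, size_t len)
{
    snprintf(out, len, "%s access to a %s page in %s mode",
             (code & PF_WRITE) ? "write" : "read",
             (code & PF_PROT) ? "protected" : "not-present",
             (code & PF_USER) ? "user" : "kernel");
}
```

For the code in this report, describe_oops_code(0x2, buf, sizeof buf) fills buf with "write access to a not-present page in kernel mode".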
Okay, finally got some time to poke at this... I was able to reproduce the problem, and I believe I've got it fixed. I can now run irw and press buttons w/o triggering an oops, kill irw, start it back up again, keep getting button presses, etc. I've pushed a build into koji which I'd appreciate folks testing to verify this is indeed fixed:
Whoops, forgot this was originally posted against F10, not F11... Here's the matching F10 build too:
I attempted to test the -206 build this evening and was unsuccessful. I didn't make it far enough into the boot to be able to get logs or a screenshot, so I wrote down what little I could. I'm not sure if what I wrote down is sufficient or if you need more info, or if my new bug is even related to this bug or one of the other changes between -191 and -206. I suspect it's something different because I don't think I even made it to lirc_serial in the short boot-time. At any rate, the kernel panicked three times consecutively and the EIP line showed:
The call trace showed:
I don't know if that information is useful, but if you can point me to what portion of the panic screen is most useful, I'd be happy to provide it.
It looks like the -209 build fixed my kernel panics and the bug fixes you added to -206 seem to be working beautifully on the lirc_serial module. I have a functional mythtv setup back, thanks. :)
Excellent to hear, Phil, thanks for the feedback!
I upgraded to F11 in the meantime, and with -211 all seems to be working fine for now. So I consider this bug fixed.
Well, I just forgot to say thanks for fixing this. :)
kernel-2.6.29.6-213.fc11 has been submitted as an update for Fedora 11.
kernel-2.6.29.6-213.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.