Description of problem:
Experienced this bug twice in the last week on two separate Dell R720s running Fedora 20 with 3.13.5-200.fc20.x86_64:

Mar 25 18:16:01 mdct-04pi kernel: BUG: scheduling while atomic: python/11974/0x00010000
Mar 25 18:16:01 mdct-04pi kernel: Modules linked in: binfmt_misc ipmi_devintf iTCO_wdt iTCO_vendor_support gpio_ich dcdbas x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel microcode sb_edac edac_core ipmi_si ipmi_
Mar 25 18:16:01 mdct-04pi kernel: CPU: 0 PID: 11974 Comm: python Not tainted 3.13.5-200.fc20.x86_64 #1
Mar 25 18:16:01 mdct-04pi kernel: Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.1.3 11/20/2013
Mar 25 18:16:01 mdct-04pi kernel: ffff88080fa14580 ffff88080fa03c08 ffffffff81686fdc ffff881002fbcda0
Mar 25 18:16:01 mdct-04pi kernel: ffff88080fa03c18 ffffffff81683452 ffff88080fa03c78 ffffffff8168a980
Mar 25 18:16:01 mdct-04pi kernel: ffff881002fbcda0 ffff880a585d5fd8 0000000000014580 0000000000014580
Mar 25 18:16:01 mdct-04pi kernel: Call Trace:
Mar 25 18:16:01 mdct-04pi kernel: <IRQ> [<ffffffff81686fdc>] dump_stack+0x45/0x56
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff81683452>] __schedule_bug+0x4c/0x5a
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8168a980>] __schedule+0x730/0x740
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8168a9b9>] schedule+0x29/0x70
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8168d1f5>] rwsem_down_read_failed+0xe5/0x120
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813216e4>] call_rwsem_down_read_failed+0x14/0x30
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8168cae0>] ? down_read+0x20/0x30
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813dc800>] n_tty_receive_buf2+0x40/0xd0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff810a2668>] ? __enqueue_entity+0x78/0x80
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813df4a5>] flush_to_ldisc+0xd5/0x120
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813df539>] tty_flip_buffer_push+0x49/0x50
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813f9ce4>] serial8250_rx_chars+0xc4/0x1f0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813f9e74>] serial8250_handle_irq.part.14+0x64/0xa0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813f9ef7>] serial8250_default_handle_irq+0x27/0x30
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813f8eeb>] serial8250_interrupt+0x5b/0xe0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff810c10fe>] handle_irq_event_percpu+0x3e/0x1d0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff810c12c7>] handle_irq_event+0x37/0x60
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff810c3c6f>] handle_edge_irq+0x6f/0x120
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8101560f>] handle_irq+0xbf/0x150
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff81072927>] ? irq_enter+0x47/0x80
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff816981cd>] do_IRQ+0x4d/0xc0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8168dead>] common_interrupt+0x6d/0x6d
Mar 25 18:16:01 mdct-04pi kernel: <EOI> [<ffffffff813d9050>] ? n_tty_chars_in_buffer+0xa0/0xa0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff810b29c6>] ? up_write+0x6/0x20
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813d90a8>] ? n_tty_ioctl+0x58/0x110
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff813d8314>] tty_ioctl+0x6d4/0xb60
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff8109e370>] ? wake_up_state+0x20/0x20
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff811cb6d8>] do_vfs_ioctl+0x2d8/0x4a0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff811b898e>] ? vfs_read+0xee/0x160
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff811cb921>] SyS_ioctl+0x81/0xa0
Mar 25 18:16:01 mdct-04pi kernel: [<ffffffff81695fa9>] system_call_fastpath+0x16/0x1b

Version-Release number of selected component (if applicable):
Fedora 20
kernel-3.13.5-200.fc20.x86_64
python-2.7.5-11.fc20.x86_64

How reproducible:
A Python program takes a request via XML-RPC, polls equipment connected to an RS232 port, and responds to the original request.

Steps to Reproduce:
1. Execute the Python program.
2. Wait a long time (days/weeks) for it to deadlock.

Actual results:
The Python program deadlocks and the kernel bug stack dump is written to the journal.

Expected results:
No deadlock due to kernel scheduling.

Additional info:
We have updated one of the servers to 3.13.6-200.fc20.x86_64 and will update the other in a day or two. The bug has not presented again yet, but I expect it will take days/weeks to determine whether the problem is resolved.
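For reference, the workload is roughly the shape sketched below (Python 3 here rather than the 2.7 on the affected hosts; the port number and function names are illustrative, and the serial poll is stubbed out since it requires the attached hardware — on the real system it reads /dev/ttyS0):

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def poll_equipment(command):
    # Stub: the real program does blocking reads/ioctls on the RS232
    # device (which is where the tty/ldisc paths in the trace come in).
    return "status-ok for %s" % command

def handle_request(command):
    # XML-RPC handler: poll the serial-attached equipment, then reply.
    return poll_equipment(command)

# Serve XML-RPC requests in a background thread.
server = SimpleXMLRPCServer(("127.0.0.1", 8765), logRequests=False)
server.register_function(handle_request, "handle_request")
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

# A client issues a request and waits for the polled response.
client = ServerProxy("http://127.0.0.1:8765/")
result = client.handle_request("probe")
print(result)
server.shutdown()
```

In the failing case the request never completes: the ioctl on the tty blocks while the serial interrupt path in the trace tries to schedule in atomic context.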
BTW, this may be related to bug 990955, but I thought it would be best to submit a new bug.
This bug is fixed in kernel-3.13.5-202.fc20 (bug 1065087). *** This bug has been marked as a duplicate of bug 1065087 ***