Bug 437498 - dlm_recv stuck in loop through lookup list
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: David Teigland
QA Contact: Red Hat Kernel QE team
Reported: 2008-03-14 12:12 EDT by David Teigland
Modified: 2009-09-03 12:51 EDT
Doc Type: Bug Fix
Last Closed: 2009-05-01 15:17:48 EDT

Description David Teigland 2008-03-14 12:12:41 EDT
Description of problem:

Dean's new dlm stress test caused this, probably when it tried to exit.

dlm_recv at 100% cpu

SysRq : Show CPUs
 ffff810102b4ff48 0000000000000000 ffff81007372b970 ffffffff8019cc11
 0000000000000000 ffff810080051400 0000000000000058 ffffffff8019cc40
 ffffffff8005e2fc ffffffff80022d96 ffff81013dc19860 ffff8100740f2000
Call Trace:
 <IRQ>  [<ffffffff8019cc11>] showacpu+0x0/0x3b
 [<ffffffff8019cc40>] showacpu+0x2f/0x3b
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff80022d96>] smp_call_function_interrupt+0x57/0x75
 [<ffffffff8005dc22>] call_function_interrupt+0x66/0x6c
 <EOI>  [<ffffffff80062558>] __sched_text_start+0x148/0xaeb
 [<ffffffff8851a0b3>] :dlm:_request_lock+0x54/0x24c
 [<ffffffff8851a0b3>] :dlm:_request_lock+0x54/0x24c
 [<ffffffff8851a387>] :dlm:process_lookup_list+0x3b/0x58
 [<ffffffff8851b10e>] :dlm:_receive_message+0x384/0xb41
 [<ffffffff80063a5d>] mutex_lock+0xd/0x1d
 [<ffffffff8851b9c7>] :dlm:dlm_receive_buffer+0xf7/0x12b
 [<ffffffff8851efe4>] :dlm:dlm_process_incoming_buffer+0x100/0x138
 [<ffffffff8000eff8>] __alloc_pages+0x65/0x2ce
 [<ffffffff8852010a>] :dlm:process_recv_sockets+0x0/0x16
 [<ffffffff885211b7>] :dlm:receive_from_sock+0x68d/0x7f4
 [<ffffffff80033275>] lock_sock+0xa7/0xb2
 [<ffffffff80142cf2>] __next_cpu+0x19/0x28
 [<ffffffff8008984f>] find_busiest_group+0x20d/0x621
 [<ffffffff80049734>] worker_thread+0x0/0x122
 [<ffffffff8852011a>] :dlm:process_recv_sockets+0x10/0x16
 [<ffffffff8004ce1f>] run_workqueue+0x94/0xe4
 [<ffffffff80049734>] worker_thread+0x0/0x122
 [<ffffffff8009daed>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80049824>] worker_thread+0xf0/0x122
 [<ffffffff8008ab24>] default_wake_function+0x0/0xe
 [<ffffffff8009daed>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032518>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009daed>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8003241a>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11

debugfs stress_waiters showed this (don't know if it's related):
f00020 1 2 resource13
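
For anyone reading that waiters line later: a minimal sketch of splitting it into fields, assuming the four columns are lock id (hex), wait type, node id, and resource name. That column layout is my assumption and is not confirmed anywhere in this report; the names Waiter and parse_waiter are made up for illustration.

```python
from typing import NamedTuple


class Waiter(NamedTuple):
    # Field meanings are assumed, not confirmed by this bug report.
    lkid: int       # lock id, printed in hex
    wait_type: int  # type of reply the lock is waiting for (assumed)
    nodeid: int     # remote node id (assumed)
    resource: str   # resource name


def parse_waiter(line: str) -> Waiter:
    """Split one waiters line into its four whitespace-separated fields."""
    lkid, wait_type, nodeid, resource = line.strip().split(None, 3)
    return Waiter(int(lkid, 16), int(wait_type), int(nodeid), resource)


if __name__ == "__main__":
    print(parse_waiter("f00020 1 2 resource13"))
```

Under that assumption, the entry above would be a single lock on resource13 waiting on a reply from node 2, which may or may not be connected to the spinning dlm_recv thread.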

Comment 1 David Teigland 2009-05-01 15:17:48 EDT
I'm keeping a note about this outside bz for whenever I happen to be working in this section of the code again.
