Bug 602414 - Machine locks up when running SysRq-w
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assigned To: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
Reported: 2010-06-09 15:48 EDT by Jon Thomas
Modified: 2012-06-14 15:54 EDT (History)
Doc Type: Bug Fix
Last Closed: 2012-06-14 15:54:04 EDT

Description Jon Thomas 2010-06-09 15:48:23 EDT
Analysis notes from Ulrich Obergfell:

The system hang of the cluster member was caused by an AB <-> BA deadlock
condition involving the smp_call_function() mechanism. These are the
stack traces of the two deadlocking threads (in vmcore.snosea.system-hang).

crash> set -c 6
  PID: 16798
COMMAND: "java"
 TASK: 10511d587f0  [THREAD_INFO: 10438d24000]
  CPU: 6

crash> bt -I 0xffffffff8011d0cb -S 0x10438d25ac8
PID: 16798  TASK: 10511d587f0       CPU: 6   COMMAND: "java"
#0 [10438d25ac8] __smp_call_function at ffffffff8011d0cb
#1 [10438d25b40] smp_call_function at ffffffff8011d118
#2 [10438d25b70] flush_tlb_all at ffffffff8011d149
#3 [10438d25b80] remove_vm_area at ffffffff80172dd6
#4 [10438d25ba0] __vunmap at ffffffff80172e2a
#5 [10438d25bc0] destroy_context at ffffffff8011579c
#6 [10438d25bd0] __mmdrop at ffffffff80136710
#7 [10438d25be0] flush_old_exec at ffffffff8018696d
#8 [10438d25c70] load_elf_binary at ffffffff801a652c
#9 [10438d25db0] search_binary_handler at ffffffff80186fa2
#10 [10438d25df0] load_script at ffffffff801a5c21
#11 [10438d25ea0] search_binary_handler at ffffffff80186fa2
#12 [10438d25ee0] compat_do_execve at ffffffff801a3d05
#13 [10438d25f20] sys32_execve at ffffffff801289c7
#14 [10438d25f50] ia32_ptregs_common at ffffffff80126c3d
  RIP: 00000000ffffe410  RSP: 00000000f3ef5cc8  RFLAGS: 00000216
  RAX: 000000000000000b  RBX: 00000000f3ef6ed0  RCX: 00000000088263e0
  RDX: 0000000008051378  RSI: 0000000000000400  RDI: 000000000096cff4
  RBP: 00000000f3ef5cc8   R8: 0000000000000000   R9: 0000000000000000
  R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
  R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
  ORIG_RAX: 000000000000000b  CS: 0023  SS: 002b

The 'java' process on CPU 6 has acquired the 'call_lock' and has sent a
'TLB flush' request to the other 7 CPUs via an inter-processor interrupt.
The 'started' and 'finished' counters in the 'call_data' structure indicate
that six of the seven CPUs were able to process the interrupt. A response was
still outstanding from CPU 4.

crash> px call_lock
call_lock = $2 = {
lock = 0xffffffff = -1,  i.e. locked and one waiter
magic = 0xdead4ead
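
The annotation on the 'lock' value can be reproduced with a small user-space
helper. This is only a sketch: decode_spinlock() and its types are hypothetical
names, not kernel APIs, and it assumes a decrement-based lock word in which 1
means unlocked and every lock attempt decrements the word once. Under that
assumption 0xffffffff (-1) reads as "held, with one waiter", which matches the
note above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoder for a decrement-based spinlock word.
 * Assumption (not taken from the kernel source): 1 = unlocked,
 * and each lock attempt decrements the word exactly once. */
enum lock_state { LOCK_FREE, LOCK_HELD, LOCK_HELD_CONTENDED };

struct lock_decode {
    enum lock_state state;
    int64_t waiters;          /* number of spinning waiters, if any */
};

struct lock_decode decode_spinlock(uint32_t raw)
{
    int32_t v = (int32_t)raw; /* crash prints the word as unsigned hex */
    struct lock_decode d;

    if (v > 0) {              /* nobody has decremented it */
        d.state = LOCK_FREE;
        d.waiters = 0;
    } else if (v == 0) {      /* one decrement: the owner */
        d.state = LOCK_HELD;
        d.waiters = 0;
    } else {                  /* owner plus -v further attempts */
        d.state = LOCK_HELD_CONTENDED;
        d.waiters = -(int64_t)v;
    }
    return d;
}
```

For the value in this dump, decode_spinlock(0xffffffff) reports a held lock
with one waiter: the 'java' process owns 'call_lock' and one other CPU is
spinning on it.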

crash> px *call_data
$3 = {
func = 0xffffffff8011ced7 <do_flush_tlb_all>,
info = 0x0,
started = {
  counter = 0x6
finished = {
  counter = 0x6
wait = 0x1
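
The "six of seven" conclusion is simple arithmetic on the 'started' counter.
As a sketch, with outstanding_ipi_acks() being a hypothetical helper name and
the CPU count of 8 inferred from "the other 7 CPUs" in the analysis:

```c
#include <assert.h>

/* Hypothetical helper: how many target CPUs have not yet picked up
 * the cross-CPU call. The sender does not interrupt itself, so the
 * number of targets is online_cpus - 1. */
int outstanding_ipi_acks(int online_cpus, int started)
{
    assert(online_cpus >= 1);
    int targets = online_cpus - 1;
    assert(started >= 0 && started <= targets);
    return targets - started;
}
```

With 8 CPUs and started = 6, outstanding_ipi_acks(8, 6) yields 1, i.e. a
single CPU (here CPU 4) never entered the IPI handler.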

CPU 4 was executing the 'sysrq-w' issued by the 'show_CPUs.sh' script which
was running to monitor high CPU load situations.

crash> set -c 4
  PID: 17133
COMMAND: "show_CPUs.sh"
 TASK: 1050c0a37f0  [THREAD_INFO: 1039c31a000]
  CPU: 4

crash> bt -I 0xffffffff8011d108 -S 0x1039c31be80
PID: 17133  TASK: 1050c0a37f0       CPU: 4   COMMAND: "show_CPUs.sh"
#0 [1039c31be80] smp_call_function at ffffffff8011d108
#1 [1039c31beb0] __handle_sysrq at ffffffff8023f89b
#2 [1039c31bef0] write_sysrq_trigger at ffffffff801b4709
#3 [1039c31bf10] vfs_write at ffffffff8017c432
#4 [1039c31bf40] sys_write at ffffffff8017c51a
#5 [1039c31bf80] system_call at ffffffff801102de
  RIP: 00000039c05bc8d2  RSP: 0000007fbfffee00  RFLAGS: 00010203
  RAX: 0000000000000001  RBX: ffffffff801102de  RCX: 00000000006c0077
  RDX: 0000000000000002  RSI: 0000002a95557000  RDI: 0000000000000001
  RBP: 0000000000000002   R8: 0000000000000001   R9: 0000002a955653e0
  R10: 0000000000000053  R11: 0000000000000246  R12: 00000039c07328c0
  R13: 0000002a95557000  R14: 0000000000000002  R15: 0000000000000000
  ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

There appears to be a bug in sysrq_handle_showcpus(): it uses the
smp_call_function() mechanism even though __handle_sysrq() has disabled
interrupts on the local CPU.

void __handle_sysrq(int key, struct pt_regs *pt_regs, struct tty_struct *tty)
{
      ...
      // A response to the inter-processor interrupt was not possible
      // because interrupts have been blocked by spin_lock_irqsave().
      spin_lock_irqsave(&sysrq_key_table_lock, flags);
      ...
      if (op_p) {
              ...
              // call sysrq_handle_showcpus()
              op_p->handler(key, pt_regs, tty);
              ...
      }
      ...
}

static void sysrq_handle_showcpus(int key, struct pt_regs *pt_regs,
                                struct tty_struct *tty) {
      // smp_call_function() attempted to acquire the 'call_lock' too.
      // However, this was already being held/owned by the 'java' process
      // on CPU 6. Hence, the 'sysrq-w' handler was unable to make progress.
      // On the other hand, the 'java' process could not make progress either
      // because CPU 4 did not respond to the inter-processor interrupt.
      ...
      smp_call_function(showacpu, NULL, 0, 0);
      ...
}
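
The AB <-> BA condition described in these comments can be modelled in user
space as a wait-for cycle. The following is only a sketch under stated
assumptions: struct cpu_model, init_bug_scenario() and is_deadlocked() are
hypothetical names, not kernel code, and the model assumes that a CPU spinning
on a lock can still acknowledge an IPI unless its local interrupts are
disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what each CPU is blocked on. */
enum wait_kind { WAIT_NONE, WAIT_LOCK, WAIT_IPI_ACK };

struct cpu_model {
    bool irqs_disabled;      /* local interrupts masked */
    bool holds_call_lock;    /* owner of 'call_lock' */
    enum wait_kind waiting;  /* what this CPU is waiting for */
    int waits_on;            /* CPU it waits on, or -1 */
};

/* Does the wait of CPU 'from' currently prevent it from progressing? */
static bool edge_blocks(const struct cpu_model *cpus, int n, int from)
{
    const struct cpu_model *c = &cpus[from];

    if (c->waiting == WAIT_NONE || c->waits_on < 0 || c->waits_on >= n)
        return false;
    if (c->waiting == WAIT_IPI_ACK)   /* target cannot take the IPI */
        return cpus[c->waits_on].irqs_disabled;
    return cpus[c->waits_on].holds_call_lock;  /* WAIT_LOCK */
}

/* True if 'start' is part of a cycle of blocking waits. */
bool is_deadlocked(const struct cpu_model *cpus, int n, int start)
{
    assert(start >= 0 && start < n);
    int cur = start;

    for (int step = 0; step < n; step++) {
        if (!edge_blocks(cpus, n, cur))
            return false;
        cur = cpus[cur].waits_on;
        if (cur == start)
            return true;
    }
    return false;
}

/* The state seen in the vmcore: CPU 6 owns call_lock and waits for
 * CPU 4's IPI response; CPU 4 has interrupts off and spins on call_lock.
 * The interrupt state of CPU 6 is an assumption and does not matter here. */
void init_bug_scenario(struct cpu_model cpus[8])
{
    for (int i = 0; i < 8; i++)
        cpus[i] = (struct cpu_model){ false, false, WAIT_NONE, -1 };
    cpus[6] = (struct cpu_model){ false, true, WAIT_IPI_ACK, 4 };
    cpus[4] = (struct cpu_model){ true, false, WAIT_LOCK, 6 };
}
```

In this model, is_deadlocked() is true for both CPU 6 and CPU 4 after
init_bug_scenario(). Clearing irqs_disabled on CPU 4 breaks the cycle, because
CPU 4 could then service the TLB flush IPI while it spins, which lets CPU 6
finish and release 'call_lock'.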

It appears that cluster member smotie had a dependency on cluster member snosea
via 'syslogd' and GFS. Stack trace of 'syslogd' (in vmcore.smotie.system-hang):

PID: 3265   TASK: 105a6eec030       CPU: 1   COMMAND: "syslogd"
#0 [105a867da98] schedule at ffffffff80314a94
#1 [105a867db70] wait_for_completion at ffffffff80314cd8
#2 [105a867dbf0] glock_wait_internal at ffffffffa019e88c
#3 [105a867dc30] gfs_glock_nq at ffffffffa019f0ca
#4 [105a867dc70] do_write_buf at ffffffffa01b36cf
#5 [105a867dd30] walk_vm at ffffffffa01b2853
#6 [105a867de20] __gfs_write at ffffffffa01b3d17
#7 [105a867de60] do_readv_writev at ffffffff8017c829
#8 [105a867df40] sys_writev at ffffffff8017c9d1
#9 [105a867df80] system_call at ffffffff801102de
  RIP: 0000002a9572ea37  RSP: 0000007fbffff778  RFLAGS: 00000246
  RAX: 0000000000000014  RBX: ffffffff801102de  RCX: ffffffffffffffff
  RDX: 0000000000000006  RSI: 0000007fbfffedd0  RDI: 0000000000000009
  RBP: 0000007fbfffedd0   R8: fefefefefefefeff   R9: 0000000000000031
  R10: 0000000000000035  R11: 0000000000000246  R12: 0000000000000009
  R13: 0000007fbfffedd0  R14: 0000000000000006  R15: 00000000bfffeee0
  ORIG_RAX: 0000000000000014  CS: 0033  SS: 002b
