Bug 167650 - NMI watchdog lockup while attempting nfs traffic on gfs during ip relocation
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Ben Marzinski
QA Contact: GFS Bugs
Blocks: 165449
Reported: 2005-09-06 12:25 EDT by Corey Marthaler
Modified: 2010-01-11 22:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2006-05-04 11:38:04 EDT

Description Corey Marthaler 2005-09-06 12:25:38 EDT
Description of problem:
This occurred while attempting to reproduce bz #166701 with Ben's fix.

I had a very similar cluster setup.

Steps to Reproduce:
1. Configure a service IP address on a two node cluster (link-02 & link-08)
2. Export a GFS filesystem over NFS
3. Mount the GFS export from an NFS client (link-01) using the service IP address
4. Generate read/write traffic over the NFS mount so that the CPU load is at least 50%
5. Use a simple script to move the service IP address between the two nodes using clusvcadm -r every 10 seconds.
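
The relocation script from step 5 is not attached to this report. A minimal sketch of what such a loop might look like follows, assuming a hypothetical service name (nfs_ip) and assuming the target node is passed with clusvcadm's -m option; the original script may have used plain clusvcadm -r without a target. The function name relocate_loop and its parameters are illustrative only.

```python
import itertools
import subprocess
import time


def relocate_loop(service, nodes, interval=10, iterations=None,
                  run=subprocess.run, sleep=time.sleep):
    """Relocate `service` round-robin across `nodes`, pausing between moves.

    `iterations=None` loops until interrupted. `run` and `sleep` are
    injectable so the loop can be dry-run without a cluster.
    Returns the list of commands that were issued.
    """
    issued = []
    targets = itertools.cycle(nodes)
    count = 0
    while iterations is None or count < iterations:
        # Assumption: -m selects the preferred target member for the relocate.
        cmd = ["clusvcadm", "-r", service, "-m", next(targets)]
        run(cmd, check=False)  # keep looping even if one relocation fails
        issued.append(cmd)
        count += 1
        sleep(interval)
    return issued
```

On a cluster node this could be started as relocate_loop("nfs_ip", ["link-02", "link-08"]), where nfs_ip stands in for whatever the service is actually called in cluster.conf.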

How reproducible:
Seen only once so far.

NMI Watchdog detected LOCKUP, CPU=1, registers:
Modules linked in: nfsd(U) exportfs(U) lockd(U) gfs(U) lock_dlm(U)
lock_harness(U) dlm(U) cman(U) parport_pc(U) lp(U) parport(U) autofs4(U)
i2c_dev(U) i2c_core(U) md5(U) ipv6(U) sunrpc(U) ds(U) yenta_socket(U)
pcmcia_core(U) button(U) battery(U) ac(U) ohci_hcd(U) hw_random(U) tg3(U)
floppy(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) ext3(U) jbd(U) dm_mod(U)
qla2300(U) qla2xxx(U) scsi_transport_fc(U) sd_mod(U) scsi_mod(U)
Pid: 6875, comm: nfsd Tainted: G   M  2.6.9-11.kdbsmp
RIP: 0010:[<ffffffff8015b734>] <ffffffff8015b734>{cache_alloc_refill+352}
RSP: 0018:0000010037aa9b58  EFLAGS: 00000013
RAX: 000001001bca8000 RBX: 0000000000000025 RCX: 00000100331a5000
RDX: 000001003ff6f4c8 RSI: 00000000000000d0 RDI: 000001003ff6f528
RBP: 0000010037e3b000 R08: 0000000000000010 R09: 0000000000000000
R10: 0000010024041f00 R11: 0000000000000070 R12: 000001003ff6f4c8
R13: 000001003ff6f480 R14: 0000010024041b00 R15: 0000010037aa9c68
FS:  0000002a9589eb00(0000) GS:ffffffff804eb900(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006c3530 CR3: 000000003ff38000 CR4: 00000000000006e0
Process nfsd (pid: 6875, threadinfo 0000010037aa8000, task 00000100326bb030)
Stack: 000000d03ff6f480 000001003ff6f480 00000000000000d0 000001000ec00580
       ffffff0000263000 0000010024041b00 0000010037aa9c68 ffffffff8015b503
       0000000000000202 0000000000000070
Call Trace:<ffffffff8015b503>{__kmalloc+123} <ffffffffa024fd8d>{:gfs:gmalloc+15}
       <ffffffffa0243e07>{:gfs:gfs_create+243} <ffffffff80180399>{vfs_create+210}
       <ffffffffa0287245>{:nfsd:nfsd+0} <ffffffffa028747d>{:nfsd:nfsd+568}
       <ffffffff80110cab>{child_rip+8} <ffffffffa0287245>{:nfsd:nfsd+0}
       <ffffffffa0287245>{:nfsd:nfsd+0} <ffffffff80110ca3>{child_rip+0}
Comment 1 Ben Marzinski 2005-09-19 15:03:20 EDT
Um.  If we can reproduce this, I'll look at it. It may have gotten fixed with the
other change I made to this call path.