Bug 487026 - GFS: gfs_grow causes lock_dlm: exxonfs: gdlm_lock 2,17 err=-16
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: GFS-kernel
Hardware: All
OS: Linux
Priority: high
Severity: high
Assigned To: David Teigland
QA Contact: Cluster QE
Duplicates: 495968
Depends On: 438268
Reported: 2009-02-23 13:26 EST by Nate Straz
Modified: 2010-10-23 03:51 EDT

Doc Type: Bug Fix
Clone Of: 438268
Last Closed: 2009-05-18 17:10:03 EDT
External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2009:1045
Priority: normal
Status: SHIPPED_LIVE
Summary: GFS-kernel bug-fix update
Last Updated: 2009-05-18 17:09:29 EDT

Description Nate Straz 2009-02-23 13:26:22 EST
+++ This bug was initially created as a clone of Bug #438268 +++

While running gfs_grow tests on 4.8 I started seeing these messages:

dlm: grow1: process_lockqueue_reply id 10371 state 0
dlm: grow1: process_lockqueue_reply id 302b4 state 0
dlm: grow1: process_lockqueue_reply id 301f8 state 0
dlm: grow1: process_lockqueue_reply id 5011e state 0
dlm: grow1: process_lockqueue_reply id 40195 state 0
lock_dlm: lm_dlm_cancel 2,4b flags 80
dlm: grow1: (10920) dlm_unlock: a590313 busy 2
lock_dlm: lm_dlm_cancel rv -16 2,4b flags 40080
lock_dlm: lm_dlm_cancel 2,4b flags 80

Which eventually turned to:

dlm: grow1: cancel reply ret 0
dlm: grow1: process_lockqueue_reply id a590313 state 0
Unable to handle kernel paging request at virtual address 00100100
 printing eip:
*pde = 00004001
Oops: 0000 [#1]
Modules linked in: lock_dlm(U) dm_cmirror(U) gnbd(U) lock_nolock(U) gfs(U) lock_harness(U) dlm(U) cman(U) md5 ipv6 parport_pc lp parport autofs4 sunrpc cpufreq_powersave loop button battery ac uhci_hcd hw_random e1000 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 ata_piix libata qla2xxx scsi_transport_fc sd_mod scsi_mod
CPU:    1
EIP:    0060:[<829e1a1a>]    Not tainted VLI
EFLAGS: 00010202   (2.6.9-80.ELhugemem) 
EIP is at process_lockqueue+0xd4/0x122 [dlm]
eax: 00000542   ebx: 001000d0   ecx: 829fd8e8   edx: 04433200
esi: 829fd8e8   edi: 001000d0   ebp: 00000000   esp: 70b40fb0
ds: 007b   es: 007b   ss: 0068
Process dlm_astd (pid: 8794, threadinfo=70b40000 task=80dacb30)
Stack: 00000000 829fd8a8 00000000 00000000 829e1b61 829e1cc6 70b40000 74167ea4 
       0213414d fffffffc ffffffff ffffffff 021340da 00000000 00000000 00000000 
       021041f5 74167e9c 00000000 00000000 
Call Trace:
 [<829e1b61>] dlm_astd+0x0/0x1a9 [dlm]
 [<829e1cc6>] dlm_astd+0x165/0x1a9 [dlm]
 [<0213414d>] kthread+0x73/0x9b
 [<021340da>] kthread+0x0/0x9b
 [<021041f5>] kernel_thread_helper+0x5/0xb
Code: 47 44 00 00 31 c9 ba 6b 00 00 00 b8 20 26 9f 82 e8 9c f1 73 7f e8 58 23 8f 7f 89 f1 f0 ff 0d e8 d8 9f 82 0f 88 2e 05 00 00 89 fb <8b> 7f 30 8d 43 30 83 ef 30 e9 5d ff ff ff b9 e8 d8 9f 82 f0 ff 
 <0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception
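
As a rough sanity check on the oops, the sketch below adds the displacement from the faulting instruction bytes (8b 7f 30, which decode as a 32-bit load from offset 0x30 off edi) to the edi value and compares the result with the faulting address. It also decodes the -16 return value seen in the lm_dlm_cancel message. The LIST_POISON1 constant is an assumption taken from the 2.6-era include/linux/list.h, not from this report; a match would be consistent with dereferencing a list entry that had already been deleted, but it does not prove that on its own.

```python
import errno

# Values copied from the oops text in this report.
EDI = 0x001000D0         # edi register at the time of the fault
FAULT_ADDR = 0x00100100  # "kernel paging request at virtual address"
DISP = 0x30              # displacement byte in the faulting bytes "8b 7f 30"

# Assumption: value of LIST_POISON1 in 2.6-era include/linux/list.h.
LIST_POISON1 = 0x00100100


def errno_name(rv: int) -> str:
    """Map a negative kernel-style return value to its errno symbol."""
    return errno.errorcode.get(-rv, "unknown")


if __name__ == "__main__":
    # Address the faulting load tried to read.
    print("edi + disp      :", hex(EDI + DISP))
    print("matches fault   :", EDI + DISP == FAULT_ADDR)
    print("matches poison  :", FAULT_ADDR == LIST_POISON1)
    # Return value from "lm_dlm_cancel rv -16".
    print("rv -16 decodes  :", errno_name(-16))
```

The errno lookup relies on the host's errno table, so it assumes a Linux-like system where 16 maps to EBUSY.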

I talked with Bob and Dave about this and they thought it was the same as bug #438268 from RHEL5.

Package versions:
Comment 1 Nate Straz 2009-02-23 13:35:41 EST
How Reproducible:

Easily, with the new growfs test, which runs a lighter sequential I/O load while growing GFS file systems with a 1k block size.
Comment 2 Robert Peterson 2009-02-24 09:22:05 EST
Dave did the fix for the original problem and POSTed it.  I'm
assuming he'll do the crosswrite to 4.x, so I'm reassigning to him.
Comment 3 David Teigland 2009-02-24 12:19:28 EST
pushed to RHEL4 branch commit 5a6349be0bdba75d2b1cc90e5c5861d2661a6304
Comment 5 David Teigland 2009-04-15 15:34:52 EDT
*** Bug 495968 has been marked as a duplicate of this bug. ***
Comment 6 Nate Straz 2009-04-17 13:45:24 EDT
Verified against:

Comment 8 errata-xmlrpc 2009-05-18 17:10:03 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

