Bug 515267 - NFS over GFS problem - invalid metadata block
Summary: NFS over GFS problem - invalid metadata block
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: GFS-kernel
Version: 4.9
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.9
Assignee: Robert Peterson
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 455696
Blocks: 674403
Reported: 2009-08-03 14:55 UTC by Robert Peterson
Modified: 2018-11-14 19:02 UTC (History)
CC List: 22 users

Fixed In Version: GFS-kernel-2.6.9-86.1.el4
Doc Type: Bug Fix
Doc Text:
Clone Of: 455696
Clones: 674403
Environment:
Last Closed: 2011-02-16 16:34:06 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to try (3.33 KB, patch)
2009-08-03 15:05 UTC, Robert Peterson


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2011:0276 | 0 | normal | SHIPPED_LIVE | GFS-kernel bug-fix update | 2011-02-16 16:33:34 UTC

Comment 1 Robert Peterson 2009-08-03 15:05:19 UTC
Created attachment 356047
Patch to try

This patch fixed the problem on my cluster.  I'd like the users
to try it and report whether it worked properly for them.

Comment 2 Robert Peterson 2009-08-03 15:06:17 UTC
Setting NEEDINFO flag until I hear back on the results from the
patch in comment #1.

Comment 4 Robert Peterson 2010-02-04 22:34:31 UTC
It's been six months and I still have not heard whether the
patch fixes the customer's problem.  I'll close this as
INSUFFICIENT_DATA for now.  If the results come in, we can re-open
it.

Comment 5 Thomas Merz 2010-06-07 12:59:51 UTC
We were not able to reproduce the issue using the newest Red Hat-provided RPMs for RHEL4, so the problem seems to be fixed.

Comment 7 Thomas Merz 2010-06-15 16:34:33 UTC
A small addition to my comment #5:
With the "patch to try" and the newest Red Hat-provided packages, we were not
able to reproduce the issue.

Comment 8 Robert Peterson 2010-06-16 13:32:19 UTC
I'll try to get this patch into 4.9 then.  Requesting ack flags
accordingly.

Comment 9 Robert Peterson 2010-06-17 21:06:38 UTC
The patch was pushed to the RHEL4 and RHEL49 branches of the
cluster git tree for inclusion into 4.9.  It was tested by
me a long time ago on the trin cluster, and by various customers
as shown in comment #6 above.  Changing status to POST.
Chris Feist does the builds for RHEL4 so I'm reassigning to him
to get this into a build.

Comment 11 Nate Straz 2011-01-14 21:16:04 UTC
I wrote a new regression test and was able to recreate the bug using RHEL 4.7.  I will let the regression test run on 4.9 over the weekend before marking this verified.

Comment 12 Nate Straz 2011-01-17 14:47:42 UTC
I hit the get_leaf assertion while running the new regression test.

GFS: fsid=dash-cluster:dash-cluster0.2: fatal: invalid metadata block
GFS: fsid=dash-cluster:dash-cluster0.2:   bh = 654416609 (type: exp=6, found=0)
GFS: fsid=dash-cluster:dash-cluster0.2:   function = get_leaf
GFS: fsid=dash-cluster:dash-cluster0.2:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-87/up/src/gfs/dir.c, line = 438
GFS: fsid=dash-cluster:dash-cluster0.2:   time = 1295140811
GFS: fsid=dash-cluster:dash-cluster0.2: about to withdraw from the cluster
GFS: fsid=dash-cluster:dash-cluster0.2: waiting for outstanding I/O
------------[ cut here ]------------
kernel BUG at /builddir/build/BUILD/gfs-kernel-2.6.9-87/up/src/gfs/lm.c:190!
invalid operand: 0000 [#1]
Modules linked in: vfat fat nfs nfsd exportfs lockd nfs_acl lock_dlm(U) dm_cmirror(U) gnbd(U) lock_nolock(U) gfs(U) lock_harness(U) dlm(U) cman(U) parport_pc lp parport autofs4 i2c_dev i2c_core md5 ipv6 sunrpc cpufreq_powersave button battery ac uhci_hcd ehci_hcd i3000_edac edac_mc tg3 qla2400 qla2xxx scsi_transport_fc dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata sd_mod scsi_mod
CPU:    0
EIP:    0060:[<f912546c>]    Not tainted VLI
EFLAGS: 00010202   (2.6.9-94.EL) 
EIP is at gfs_lm_withdraw+0x50/0xbc [gfs]
eax: 00000044   ebx: f916f94c   ecx: f9148456   edx: dfb09da4
esi: f915b000   edi: 00000000   ebp: f915b000   esp: dfb09db8
ds: 007b   es: 007b   ss: 0068
Process find (pid: 10901, threadinfo=dfb09000 task=f5b412a0)
Stack: f916f94c cb3fd400 f9144647 f915b000 f914cb87 f916f94c f916f94c 27019ae1 
       00000000 00000006 00000000 f916f94c f9144a0f f916f94c f91464be 000001b6 
       f916f94c 4d3247cb ecb7cf1c f910e544 00000000 f9144a0f f91464be 000001b6 
Call Trace:
 [<f9144647>] gfs_metatype_check_ii+0x34/0x3f [gfs]
 [<f910e544>] get_leaf+0xc1/0xd5 [gfs]
 [<f911051d>] dir_e_read+0x1f2/0x2c9 [gfs]
 [<f9110c24>] gfs_dir_read+0x18/0x25 [gfs]
 [<f9131a9d>] filldir_reg_func+0x0/0x12c [gfs]
 [<f9131cd3>] readdir_reg+0x10a/0x12c [gfs]
 [<f9131a9d>] filldir_reg_func+0x0/0x12c [gfs]
 [<c0183d99>] filldir64+0x0/0x11a
 [<c0183d99>] filldir64+0x0/0x11a
 [<c0183d99>] filldir64+0x0/0x11a
 [<f9132098>] gfs_readdir+0x4e/0x5b [gfs]
 [<c0183a02>] vfs_readdir+0x8a/0xb7
 [<c018404f>] sys_getdents64+0x80/0xba
 [<c03246eb>] syscall_call+0x7/0xb
 [<c032007b>] packet_recvmsg+0xef/0x11a

Comment 13 Robert Peterson 2011-02-01 18:59:18 UTC
We discussed this problem in our weekly meeting.  We decided
that the patch makes things better, not worse, so although
the problem apparently isn't completely fixed, shipping the
patch in 4.9 is better than not shipping it.

Bug #674403 was opened to address any ongoing issues.
Changing status to ON_QA.

Comment 14 errata-xmlrpc 2011-02-16 16:34:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0276.html

