Bug 173951 - corruption detected in function gfs_get_meta_buffer during recovery
Status: CLOSED DUPLICATE of bug 175589
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Ben Marzinski
QA Contact: GFS Bugs
Depends On:
Blocks:
Reported: 2005-11-22 16:51 EST by Corey Marthaler
Modified: 2010-01-11 22:08 EST (History)
Doc Type: Bug Fix
Last Closed: 2006-01-04 14:10:58 EST
Attachments: None

Description Corey Marthaler 2005-11-22 16:51:21 EST
Description of problem:
I hit this issue on link-01 while running revolver (with the LITE I/O load) on
an eight-node cluster (link-01 - link-08).  Link-06, link-07, and link-08 were
shot by revolver, and while replaying the journals, link-01 hit this corruption.

<Nov/22 03:45 pm>CMAN: removing node link-08 from the cluster : No response to messages
<Nov/22 03:46 pm>CMAN: removing node link-07 from the cluster : No response to messages
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs2.2: jid=4: Trying to acquire journal lock...
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs1.2: jid=4: Trying to acquire journal lock...
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs1.2: jid=4: Busy
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs2.2: jid=4: Busy
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs2.2: jid=3: Trying to acquire journal lock...
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs2.2: jid=3: Busy
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs2.2: jid=0: Trying to acquire journal lock...
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs2.2: jid=0: Busy
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs1.2: jid=3: Trying to acquire journal lock...
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs1.2: jid=3: Busy
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs1.2: jid=1: Trying to acquire journal lock...
<Nov/22 03:47 pm>GFS: fsid=LINK_128:gfs1.2: jid=1: Busy
<Nov/22 03:47 pm>CMAN: quorum lost, blocking activity
<Nov/22 03:50 pm>CMAN: quorum regained, resuming activity
<Nov/22 03:51 pm>GFS: fsid=LINK_128:gfs2.2: jid=1: Trying to acquire journal lock...
<Nov/22 03:51 pm>GFS: fsid=LINK_128:gfs1.2: jid=0: Trying to acquire journal lock...
<Nov/22 03:51 pm>GFS: fsid=LINK_128:gfs2.2: jid=1: Busy
<Nov/22 03:51 pm>GFS: fsid=LINK_128:gfs1.2: jid=0: Busy
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2: fatal: invalid metadata block
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2:   bh = 31109846 (type: exp=4, found=0)
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2:   function = gfs_get_meta_buffer
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2:   file = /usr/src/build/643480-x86_64/BUILD/gfs-kernel-2.6.9-44/smp/src/gfs/dio.c, line = 1223
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2:   time = 1132676573
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2: about to withdraw from the cluster
<Nov/22 03:55 pm>GFS: fsid=LINK_128:gfs1.2: waiting for outstanding I/O
<Nov/22 03:55 pm>----------- [cut here ] --------- [please bite here ] ---------
<Nov/22 03:55 pm>Kernel BUG at lm:190
<Nov/22 03:55 pm>invalid operand: 0000 [1] SMP
<Nov/22 03:55 pm>CPU 1
<Nov/22 03:55 pm>Modules linked in: lock_dlm(U) gnbd(U) lock_nolock(U) gfs(U) lock_harness(U) dlm(U) cman(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc ds yenta_socket pcmcia_core dm_mod ohci_hcd hw_random tg3 floppy ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
<Nov/22 03:55 pm>Pid: 6206, comm: growfiles Tainted: G   M  2.6.9-22.0.1.ELsmp
<Nov/22 03:55 pm>RIP: 0010:[<ffffffffa021e807>] <ffffffffa021e807>{:gfs:gfs_lm_withdraw+215}
<Nov/22 03:55 pm>RSP: 0018:0000010037af5af8  EFLAGS: 00010202
<Nov/22 03:55 pm>RAX: 0000000000000037 RBX: ffffff00001828c0 RCX: 0000000100000000
<Nov/22 03:55 pm>RDX: ffffffff803d78c8 RSI: 0000000000000246 RDI: ffffffff803d78c0
<Nov/22 03:55 pm>RBP: ffffff000014a000 R08: ffffffff803d78c8 R09: ffffff00001828c0
<Nov/22 03:55 pm>R10: ffffffff8011de14 R11: ffffffff8011de14 R12: 000001002a860528
<Nov/22 03:55 pm>R13: 000001002a8606b8 R14: 0000000000000000 R15: 0000000001dab2d6
<Nov/22 03:55 pm>FS:  0000002a95575f00(0000) GS:ffffffff804d3100(005b) knlGS:00000000f7fdf6c0
<Nov/22 03:55 pm>CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
<Nov/22 03:55 pm>CR2: 00000000f7ffc000 CR3: 000000003ff38000 CR4: 00000000000006e0
<Nov/22 03:55 pm>Process growfiles (pid: 6206, threadinfo 0000010037af4000, task 000001003e749030)
<Nov/22 03:55 pm>Stack: 0000003000000030 0000010037af5c28 0000010037af5b18 ffffffffa0007ac4
<Nov/22 03:55 pm>       00000100022802e8 000001003fb54ab0 ffffff00001828c0 ffffff00001828c0
<Nov/22 03:55 pm>       0000000001dab2d6 0000000000000004
<Nov/22 03:55 pm>Call Trace:<ffffffffa0007ac4>{:scsi_mod:scsi_request_fn+1100}
<Nov/22 03:55 pm>       <ffffffff80303814>{io_schedule+37} <ffffffff80178062>{__wait_on_buffer+143}
<Nov/22 03:55 pm>       <ffffffff80177ed6>{bh_wake_function+0} <ffffffffa0236c83>{:gfs:gfs_metatype_check_ii+54}
<Nov/22 03:55 pm>       <ffffffffa020b52c>{:gfs:gfs_get_meta_buffer+580} <ffffffffa0217985>{:gfs:gfs_copyin_dinode+23}
<Nov/22 03:55 pm>       <ffffffff8011de14>{flat_send_IPI_mask+0} <ffffffffa021744d>{:gfs:inode_go_lock+38}
<Nov/22 03:55 pm>       <ffffffffa021457a>{:gfs:glock_wait_internal+563} <ffffffffa0214cd2>{:gfs:gfs_glock_nq+961}
<Nov/22 03:55 pm>       <ffffffffa0214efb>{:gfs:gfs_glock_nq_init+20} <ffffffffa022c8a7>{:gfs:gfs_permission+64}
<Nov/22 03:55 pm>       <ffffffffa02272e1>{:gfs:gfs_drevalidate+409} <ffffffff80183086>{permission+51}
<Nov/22 03:55 pm>       <ffffffff80184dba>{may_open+88} <ffffffff801852ab>{open_namei+788}
<Nov/22 03:55 pm>       <ffffffff80131c39>{finish_task_switch+55} <ffffffff80176524>{filp_open+39}
<Nov/22 03:55 pm>       <ffffffff801e9fd5>{strncpy_from_user+74} <ffffffff8017662d>{get_unused_fd+230}
<Nov/22 03:55 pm>       <ffffffff8012762f>{sys32_open+54} <ffffffff8012500f>{cstar_do_call+27}
<Nov/22 03:55 pm>
<Nov/22 03:55 pm>
<Nov/22 03:55 pm>Code: 0f 0b 3b a8 23 a0 ff ff ff ff be 00 8b 85 a0 88 03 00 85 c0
<Nov/22 03:55 pm>RIP <ffffffffa021e807>{:gfs:gfs_lm_withdraw+215} RSP <0000010037af5af8>
<Nov/22 03:55 pm> <0>Kernel panic - not syncing: Oops
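
For anyone triaging a similar withdraw message, the numeric fields in the
"fatal: invalid metadata block" lines can be decoded with a short script. The
following is only a sketch: the function name decode_withdraw is hypothetical,
and the metadata type names are an assumption recalled from the GFS on-disk
header (gfs_ondisk.h), so they should be checked against the header for the
gfs-kernel version in use.

```python
import re
from datetime import datetime, timezone

# Fields copied from the withdraw message in the log above.
LOG = """\
GFS: fsid=LINK_128:gfs1.2:   bh = 31109846 (type: exp=4, found=0)
GFS: fsid=LINK_128:gfs1.2:   function = gfs_get_meta_buffer
GFS: fsid=LINK_128:gfs1.2:   time = 1132676573
"""

# ASSUMPTION: type numbers as recalled from gfs_ondisk.h; only the two
# values that occur in this report are listed. Verify before relying on them.
METATYPES = {0: "NONE (no metadata type)", 4: "DI (dinode)"}


def decode_withdraw(log):
    """Extract block number, expected/found types, and time from the log text."""
    bh = re.search(r"bh = (\d+) \(type: exp=(\d+), found=(\d+)\)", log)
    ts = re.search(r"time = (\d+)", log)
    block, expected, found = (int(g) for g in bh.groups())
    # The time field is seconds since the Unix epoch; render it in UTC.
    when = datetime.fromtimestamp(int(ts.group(1)), tz=timezone.utc)
    return {
        "block_dec": block,
        "block_hex": hex(block),
        "expected": METATYPES.get(expected, f"type {expected}"),
        "found": METATYPES.get(found, f"type {found}"),
        "time_utc": when.strftime("%Y-%m-%d %H:%M:%S"),
    }


info = decode_withdraw(LOG)
for key, value in info.items():
    print(f"{key}: {value}")
```

The block number 31109846 is 0x1dab2d6 in hex, the same value that appears in
R15 and in the stack dump. If type 4 is indeed the dinode type, the expected
type is consistent with gfs_copyin_dinode appearing in the call trace. The
time field is an epoch value taken on the node itself and need not agree with
the <Nov/22 03:55 pm> prefixes, which appear to come from whatever captured
the console output.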



Version-Release number of selected component (if applicable):
Kernel 2.6.9-22.0.1.ELsmp on an x86_64
CMAN 2.6.9-40.0 (built Nov  7 2005 15:30:36) installed
DLM 2.6.9-39.0 (built Nov 14 2005 17:38:14) installed
Lock_Harness 2.6.9-44.0 (built Nov 17 2005 15:43:18) installed
GFS 2.6.9-44.0 (built Nov 17 2005 15:43:35) installed
Lock_Nolock 2.6.9-44.0 (built Nov 17 2005 15:43:19) installed
Comment 1 Corey Marthaler 2005-12-13 14:14:50 EST
Just a note that this issue appears to have been seen outside of Red Hat as well; see bug 175589.
Comment 2 Ben Marzinski 2006-01-04 14:10:58 EST

*** This bug has been marked as a duplicate of 175589 ***
