Bug 129259

Summary:	Bad metadata assertion while attempting to mount many filesystems
Product:	[Retired] Red Hat Cluster Suite	Reporter:	Corey Marthaler <cmarthal>
Component:	gfs	Assignee:	Ken Preslan <kpreslan>
Status:	CLOSED WORKSFORME	QA Contact:	GFS Bugs <gfs-bugs>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4
Target Milestone:	---
Target Release:	---
Hardware:	i686
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2005-01-31 21:31:20 UTC	Type:	---

Description Corey Marthaler 2004-08-05 16:30:55 UTC

Description of problem:
I was trying to remount my 100 filesystems on my morph-cluster, which
caused morph-01 to hit this assertion:

dlm: gfs0: rebuilt 0 resources
dlm: gfs0: recover event 13 done
dlm: gfs0: recover event 13 finished
GFS: fsid=morph-cluster:gfs0.0: Joined cluster. Now mounting FS...
GFS: fsid=morph-cluster:gfs0.0: jid=0: Trying to acquire journal lock...
GFS: fsid=morph-cluster:gfs0.0: jid=0: Looking at journal...
GFS: fsid=morph-cluster:gfs0.0: jid=0: Acquiring the transaction lock...
GFS: fsid=morph-cluster:gfs0.0: jid=0: Replaying journal...
Bad metadata at 251078
  mh_magic = 0x20332036
  mh_type = 540105994
  mh_generation = 740637986563911541
  mh_format = 1853104199
  mh_incarn = 1919907184

GFS: Assertion failed on line 466 of file
/usr/src/cluster/gfs-kernel/src/gfs/lops.c
GFS: assertion: "meta_check_magic == GFS_MAGIC"
GFS: time = 1091723123
GFS: fsid=morph-cluster:gfs0.0

Kernel panic: GFS: Record message above and reboot.
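
One way to read the header values in the report above is to pack them back into raw bytes and see what they look like. The sketch below does that. It assumes the five fields are laid out as 32/32/64/32/32-bit integers stored big-endian on disk, and that the values printed are the decoded integers; neither is stated in this report. GFS_MAGIC is the magic number named in the assertion, and the helper names are made up for illustration.

```python
import struct

# Header values exactly as printed in the "Bad metadata" report above.
bad_header = {
    "mh_magic": 0x20332036,
    "mh_type": 540105994,
    "mh_generation": 740637986563911541,
    "mh_format": 1853104199,
    "mh_incarn": 1919907184,
}

GFS_MAGIC = 0x01161970  # magic number a valid GFS metadata header carries


def header_bytes(header):
    """Repack the fields into bytes (assumed layout: big-endian u32,u32,u64,u32,u32)."""
    return struct.pack(
        ">IIQII",
        header["mh_magic"],
        header["mh_type"],
        header["mh_generation"],
        header["mh_format"],
        header["mh_incarn"],
    )


def looks_like_text(raw):
    """True if every byte is printable ASCII or common whitespace."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in raw)


raw = header_bytes(bad_header)
print("magic matches GFS_MAGIC:", bad_header["mh_magic"] == GFS_MAGIC)
print("raw header bytes:", raw)
print("printable ASCII only:", looks_like_text(raw))
```

Under those assumptions the 24 bytes come out as the ASCII text " 3 6 1]\n\nGFS Mount Group", which reads like text written over the block rather than a damaged metadata header. That would be in line with the corruption theory raised in the later comments, but it is only a reading of the numbers, not a confirmed cause.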


How reproducible:
Didn't try

Comment 1 Corey Marthaler 2004-08-05 19:00:53 UTC

I can reproduce this and have narrowed it down to just the first of
the 100 filesystems. If I attempt to mount any of the remaining 99
filesystems it works, but an attempt to mount the first one either
asserts or hangs.

Comment 2 Corey Marthaler 2004-08-05 19:03:56 UTC

Superblock of the fs in question:

[root@morph-02 root]# gfs_tool sb /dev/gfs/lvol0 all
  mh_magic = 0x01161970
  mh_type = 1
  mh_generation = 0
  mh_format = 100
  mh_incarn = 0
  sb_fs_format = 1309
  sb_multihost_format = 1401
  sb_flags = 0
  sb_bsize = 4096
  sb_bsize_shift = 12
  sb_seg_size = 16
  no_formal_ino = 21
  no_addr = 21
  no_formal_ino = 22
  no_addr = 22
  no_formal_ino = 25
  no_addr = 25
  sb_lockproto = lock_dlm
  sb_locktable = morph-cluster:gfs0
  no_formal_ino = 23
  no_addr = 23
  no_formal_ino = 24
  no_addr = 24
  sb_reserved =
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
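
For comparison with the bad header in the description, the dump above can be sanity-checked mechanically. The sketch below parses the "key = value" lines (sb_reserved left out), checks mh_magic against GFS_MAGIC, checks that sb_bsize agrees with sb_bsize_shift, and works out where block 251078 would sit on the device. The helper names are hypothetical, and the byte offset assumes the "Bad metadata at" address counts filesystem blocks of sb_bsize bytes, which this report does not confirm.

```python
from collections import defaultdict

GFS_MAGIC = 0x01161970  # GFS on-disk magic number

# Scalar lines copied from the gfs_tool output above (sb_reserved omitted).
SB_DUMP = """\
  mh_magic = 0x01161970
  mh_type = 1
  mh_generation = 0
  mh_format = 100
  mh_incarn = 0
  sb_fs_format = 1309
  sb_multihost_format = 1401
  sb_flags = 0
  sb_bsize = 4096
  sb_bsize_shift = 12
  sb_seg_size = 16
  no_formal_ino = 21
  no_addr = 21
  no_formal_ino = 22
  no_addr = 22
  no_formal_ino = 25
  no_addr = 25
  sb_lockproto = lock_dlm
  sb_locktable = morph-cluster:gfs0
  no_formal_ino = 23
  no_addr = 23
  no_formal_ino = 24
  no_addr = 24
"""


def parse_fields(text):
    """Collect 'key = value' lines; keys such as no_addr repeat, so keep lists."""
    fields = defaultdict(list)
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()].append(value.strip())
    return fields


def as_int(value):
    """Parse a decimal or 0x-prefixed integer."""
    return int(value, 0)


sb = parse_fields(SB_DUMP)
magic_ok = as_int(sb["mh_magic"][0]) == GFS_MAGIC
bsize = as_int(sb["sb_bsize"][0])
shift_ok = bsize == 1 << as_int(sb["sb_bsize_shift"][0])

bad_block = 251078               # from "Bad metadata at 251078"
byte_offset = bad_block * bsize  # assumption: address is in filesystem blocks

print("superblock magic ok:", magic_ok)
print("block size consistent with shift:", shift_ok)
print("byte offset of bad block:", byte_offset)
```

Both checks pass for this dump, so the superblock itself looks intact and the damage appears to be limited to the block hit during journal replay.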

Comment 3 Adam "mantis" Manthei 2004-08-06 00:06:16 UTC

What sort of loads were you using prior to running into this assert?
It looks as though you may have corrupted your filesystem.  It would
be interesting to see if you can reproduce this assert using nolock
and iterating over the journal ids with the "jid=" mount option.  If
you can, then maybe look at gfs.fsck?

Assigning this to Ken in the meantime.

Comment 4 Corey Marthaler 2005-01-10 22:44:08 UTC

I'm sure this was the result of a corrupted filesystem and I'll bet a
gfs.fsck would have fixed it. The load being run was doio/iogen,
genesis, and accordion.

Comment 6 Corey Marthaler 2005-01-31 21:31:20 UTC

Hasn't been seen in almost 6 months.