Bug 129007

Summary:	Bad metadata assertion failed: "metatype_check_magic == GFS_MAGIC && metatype_check_type == ((height) ? (5) : (4))"
Product:	[Retired] Red Hat Cluster Suite	Reporter:	Corey Marthaler <cmarthal>
Component:	gfs	Assignee:	Ben Marzinski <bmarzins>
Status:	CLOSED WORKSFORME	QA Contact:	Cluster QE <mspqa-list>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4
Target Milestone:	---
Target Release:	---
Hardware:	i686
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2005-11-22 21:29:17 UTC	Type:	---

Description Corey Marthaler 2004-08-02 22:18:01 UTC

Description of problem: 
This assertion tripped while revolver was building the 
sistina-test tree on each of the nodes in the morph cluster; I'm 
sure that was not the cause, however. 
 
It had checked that all the filesystems were fine on each of the 
nodes and then started the build of sistina-test. By the time it 
was ready to start shooting nodes, morph-06 had already panicked. 
 
Aug  2 16:46:19 morph-06 kernel: dlm: gfs2: ignore master reply 
101f0 4 
Aug  2 16:46:49 morph-06 kernel: Bad metadata at 65812, should be 5 
Aug  2 16:46:49 morph-06 kernel:   mh_magic = 0x01161970 
Aug  2 16:46:49 morph-06 kernel:   mh_type = 0 
Aug  2 16:46:49 morph-06 kernel:   mh_generation = 0 
Aug  2 16:46:49 morph-06 kernel:   mh_format = 0 
Aug  2 16:46:49 morph-06 kernel:   mh_incarn = 0 
Aug  2 16:46:49 morph-06 kernel: 
Aug  2 16:46:49 morph-06 kernel: GFS: Assertion failed on line 1181 
of file /usr/src/cluster/gfs-kernel/src/gfs/dio.c 
Aug  2 16:46:49 morph-06 kernel: GFS: assertion: 
"metatype_check_magic == GFS_MAGIC && metatype_check_type == 
((height) ? (5) : (4))" 
Aug  2 16:46:49 morph-06 kernel: GFS: time = 1091483209 
Aug  2 16:46:49 morph-06 kernel: GFS: fsid=morph-cluster:gfs0new.0 
Aug  2 16:46:49 morph-06 kernel: 
Aug  2 16:46:49 morph-06 kernel: Kernel panic: GFS: Record message 
above and reboot. 
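
The failed assertion has two clauses (magic and type), and the dump
above is enough to tell which one tripped. The following is only a
rough triage sketch, not GFS code. It assumes that
metatype_check_magic and metatype_check_type are read from the
mh_magic and mh_type fields printed in the dump, and that the
"should be N" value is what the ((height) ? (5) : (4)) expression
evaluated to. GFS_MAGIC is taken from the expanded assertion text
(0x01161970). The names diagnose and DESCRIPTION_DUMP are
hypothetical helpers introduced here for illustration.

```python
import re

# Magic value as it appears in the expanded assertion text in this report.
GFS_MAGIC = 0x01161970

# Dump copied from the morph-06 log in the description.
DESCRIPTION_DUMP = """
Bad metadata at 65812, should be 5
  mh_magic = 0x01161970
  mh_type = 0
  mh_generation = 0
  mh_format = 0
  mh_incarn = 0
"""


def diagnose(dump_text):
    """Report which clause of the assertion holds for a pasted dump.

    Assumption: the assertion compares mh_magic against GFS_MAGIC and
    mh_type against the "should be N" value printed in the first line.
    """
    expected_type = int(re.search(r"should be (\d+)", dump_text).group(1))
    fields = {}
    for name, value in re.findall(r"(mh_\w+)\s*=\s*(0x[0-9a-fA-F]+|\d+)", dump_text):
        fields[name] = int(value, 0)  # base 0 accepts both hex and decimal
    return {
        "magic_ok": fields["mh_magic"] == GFS_MAGIC,
        "type_ok": fields["mh_type"] == expected_type,
    }


print(diagnose(DESCRIPTION_DUMP))  # {'magic_ok': True, 'type_ok': False}
```

For this dump the magic clause holds and only the type clause fails
(mh_type is 0 where 5 was expected).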
 
 
Aug  2 16:46:00 morph-05 kernel: CMAN: no HELLO from morph-06, 
removing from the cluster 
Aug  2 16:46:05 morph-05 kernel: dlm: gfs4: recover event 565 
Aug  2 16:46:05 morph-05 kernel: dlm: gfs4: remove node 1 
 
 
How reproducible: 
Didn't try

Comment 1 Corey Marthaler 2004-09-16 14:29:16 UTC

I reproduced this panic on 4 out of 6 nodes last night after about 
3-4 hours of I/O load. 
 
The load was: 
genesis  
accordion  
growfiles  
iogen/doio

Comment 2 Corey Marthaler 2004-10-29 14:31:31 UTC

Reproduced this again last night while running the above I/O load.

Comment 3 Ken Preslan 2004-11-16 19:20:09 UTC

Reassign

Comment 4 Corey Marthaler 2004-12-14 19:04:23 UTC

This appears to be the same assertion, with a stack trace this time.
I've been seeing this quite a bit while running revolver lately.

GFS: fsid=morph-cluster:corey0.3: jid=2: Busy
dlm: corey0: resent 0 requests
dlm: corey0: recover event 78 finished
Info fld=0x0, Current sda: sense key No Sense
Bad metadata at 67395942, should be 4
  mh_magic = 0x05004400
  mh_type = 3305182976
  mh_generation = 288230380446679040
  mh_format = 16777216
  mh_incarn = 0
 [<f8a4de72>] gfs_assert_i+0x32/0xc0 [gfs]
 [<c01230c1>] vprintk+0x111/0x160
 [<c0122fa7>] printk+0x17/0x20
 [<f8a3837e>] gfs_meta_header_print+0x6e/0x80 [gfs]
 [<f8a1f579>] gfs_get_meta_buffer+0x1e9/0x360 [gfs]
 [<f8a2de5d>] gfs_copyin_dinode+0x2d/0x1b0 [gfs]
 [<c011f2d0>] default_wake_function+0x0/0x10
 [<c011f2d0>] default_wake_function+0x0/0x10
 [<f8a2d52d>] inode_go_lock+0x4d/0x60 [gfs]
 [<f8a2a6e5>] glock_wait_internal+0x105/0x220 [gfs]
 [<f8a2aa9f>] gfs_glock_nq+0x6f/0x100 [gfs]
 [<f8a2b1ae>] gfs_glock_nq_init+0x1e/0x40 [gfs]
 [<f8a4286a>] gfs_permission+0x4a/0x80 [gfs]
 [<f8a42820>] gfs_permission+0x0/0x80 [gfs]
 [<c0169708>] permission+0x68/0x70
 [<c016b1d7>] may_open+0x47/0x260
 [<c016b4a1>] open_namei+0xb1/0x650
 [<c015bb9d>] filp_open+0x2d/0x60
 [<c015be08>] get_unused_fd+0x78/0xd0
 [<c015bf4c>] sys_open+0x3c/0xa0
 [<c0105f5d>] sysenter_past_esp+0x52/0x71
Kernel panic - not syncing: GFS: fsid=morph-cluster:corey0.3:
assertion "(metatype_check_magic == (0x01161970) &&
metatype_check_type == ((height) ? (5) : (4)))" failed
GFS: fsid=morph-cluster:corey0.3:   function = gfs_get_meta_buffer
GFS: fsid=morph-cluster:corey0.3:   file =
/usr/src/cluster/gfs-kernel/src/gfs/dio.c, line = 1214
GFS: fsid=morph-cluster:corey0.3:   time = 1103049397
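
The "time =" values in the panic output can be lined up against
syslog and comment timestamps. The sketch below assumes the value is
seconds since the Unix epoch, which the panic text does not state but
which is consistent with the dates in this report. The helper name
gfs_time_to_utc is hypothetical.

```python
from datetime import datetime, timezone


def gfs_time_to_utc(epoch_seconds):
    """Convert a 'GFS: time = N' value to an aware UTC datetime.

    Assumption: N is seconds since the Unix epoch.
    """
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Values copied from the two panic messages in this report.
for label, value in (("description", 1091483209), ("comment 4", 1103049397)):
    print(label, gfs_time_to_utc(value).isoformat())
# description 2004-08-02T21:46:49+00:00
# comment 4 2004-12-14T18:36:37+00:00
```

The first value matches the 16:46:49 syslog lines from morph-06 if
that node logged in a UTC-5 timezone, which is an assumption; the
second falls shortly before this comment was posted.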

Comment 5 Ben Marzinski 2005-01-13 16:30:46 UTC

I've been running the revolver load set to HEAVY on four filesystems in
my cluster for over two days now, with no sign of this bug. Corey ran
his tests again last night, and did not see the bug.  Either it got
fixed as a side effect of fixing something else, or it will rear its
ugly head later... in which case, I'll deal with it then.

Comment 6 Corey Marthaler 2005-11-22 21:29:17 UTC

This has not been seen in almost a year; closing.