
Bug 432824

Summary: GFS2: warning: assertion "al->al_alloced" failed in alloc_page_backing
Product: Red Hat Enterprise Linux 5 Reporter: Nate Straz <nstraz>
Component: kernel Assignee: Don Zickus <dzickus>
Status: CLOSED ERRATA QA Contact: GFS Bugs <gfs-bugs>
Severity: low Docs Contact:
Priority: low    
Version: 5.2 CC: edamato, lwang, rkenna, rpeterso, swhiteho
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2008-0314 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-05-21 15:09:32 UTC Type: ---
Attachments:
Description Flags
program to recreate bug none

Description Nate Straz 2008-02-14 16:40:59 UTC
Description of problem:

This new backtrace showed up on all four of my nodes 4-5 times during a test
run.  I'm not sure which test case was running at the time.  Each node was
running an independent load with multiple test cases running at once.

GFS2: fsid=morph-cluster:brawl0.0: warning: assertion "al->al_alloced" failed
GFS2: fsid=morph-cluster:brawl0.0:   function = alloc_page_backing, file = /builddir/build/BUILD/gfs2-kmod-1.79/_kmod_build_PAE/ops_vm.c, line = 94
 [<f8d99d62>] gfs2_assert_warn_i+0x7e/0x113 [gfs2]
 [<f8d92814>] gfs2_sharewrite_nopage+0x24c/0x2bb [gfs2]
 [<f8d9260b>] gfs2_sharewrite_nopage+0x43/0x2bb [gfs2]
 [<c045f2de>] __handle_mm_fault+0x1d0/0xb62
 [<f8d85f51>] gfs2_glock_nq+0x16b/0x18b [gfs2]
 [<c042de3a>] lock_timer_base+0x15/0x2f
 [<c04e2822>] prio_tree_insert+0x1b/0x1f2
 [<c0609726>] do_page_fault+0x2a5/0x5d3
 [<c0609481>] do_page_fault+0x0/0x5d3
 [<c0405a71>] error_code+0x39/0x40
 =======================

Version-Release number of selected component (if applicable):
kernel-2.6.18-79.el5
kmod-gfs2-1.79-1.4.el5

How reproducible:
Unknown

Comment 1 Nate Straz 2008-02-14 18:15:19 UTC
Raising the flags since this is a recent regression. 

Comment 3 Nate Straz 2008-02-19 16:53:36 UTC
I ran through brawl again and found that the messages only showed up while the
tests were running on a file system with a 1k block size.

Comment 4 Nate Straz 2008-02-19 17:45:20 UTC
I ran the test cases from d_io one at a time and it looks like the tag
"genesis_reg" is the reproducer.

genesis -i 30s -n 1000 -d 100 -p 10  -L flock -s 1048576  -w /mnt/gfs2

Comment 5 Steve Whitehouse 2008-02-22 16:07:38 UTC
Take a look at gfs2_write_alloc_required() as I suspect that you'll find the
answer in the recent changes to that function.

Comment 6 Abhijith Das 2008-02-26 20:34:02 UTC
I traced the cause of this assert-warning to a code-change to
gfs2_write_alloc_required() as part of the patch to bug 253990.

@@ -1226,8 +1193,13 @@ int gfs2_write_alloc_required(struct gfs
 		do_div(lblock_stop, bsize);
 	} else {
 		unsigned int shift = sdp->sd_sb.sb_bsize_shift;
+		u64 end_of_file = (ip->i_di.di_size + sdp->sd_sb.sb_bsize - 1) >> shift;
 		lblock = offset >> shift;
 		lblock_stop = (offset + len + sdp->sd_sb.sb_bsize - 1) >> shift;
+		if (lblock_stop > end_of_file) {
+			*alloc_required = 1;
+			return 0;
+		}
 	}
 
 	for (; lblock < lblock_stop; lblock += extlen) {
		error = gfs2_extent_map(&ip->i_inode, lblock, &new, &dblock, &extlen);
		if (error)
			return error;

		if (!dblock) {
			*alloc_required = 1;
			return 0;
		}
	}

Here, we check whether the requested write extends beyond the end of the file.
If it does, we assume allocation is required and set alloc_required = 1. This
saves the looping call to gfs2_extent_map below, which would otherwise determine
whether the underlying disk blocks are allocated.

However, in the case where we trip this assert warning, the disk-blocks are
already alloced beyond the end of file, but we still set alloc_required = 1.
gfs2 then goes on to alloc_page_backing() to find that the blocks are already
alloced and trips the warning.
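
The rounding in the patch can be worked through in user space. The following is
a minimal sketch, not kernel code: block_range() and shortcut_fires() are
hypothetical helpers that copy only the arithmetic from the diff above, and the
file size, offset and length in the usage note are assumed values chosen to
resemble a one-page write fault.

```c
#include <assert.h>
#include <stdint.h>

/* Block numbers computed the same way as in the patched branch of
 * gfs2_write_alloc_required(). This struct is illustrative only. */
struct write_range {
	uint64_t lblock;      /* first logical block touched by the write */
	uint64_t lblock_stop; /* one past the last logical block touched */
	uint64_t end_of_file; /* file size rounded up to whole blocks */
};

/* Hypothetical helper: mirrors the shift arithmetic from the diff.
 * 'shift' plays the role of sdp->sd_sb.sb_bsize_shift. */
static struct write_range block_range(uint64_t di_size, uint64_t offset,
				      uint64_t len, unsigned int shift)
{
	struct write_range r;
	uint64_t bsize;

	assert(shift < 64);
	bsize = (uint64_t)1 << shift;

	r.end_of_file = (di_size + bsize - 1) >> shift;
	r.lblock = offset >> shift;
	r.lblock_stop = (offset + len + bsize - 1) >> shift;
	return r;
}

/* Non-zero when the patch would set *alloc_required = 1 without
 * consulting the block map at all. */
static int shortcut_fires(const struct write_range *r)
{
	return r->lblock_stop > r->end_of_file;
}
```

For an assumed 1500-byte file and a 4096-byte write at offset 0, a shift of 10
(1k blocks) gives end_of_file = 2 and lblock_stop = 4, so the shortcut fires
regardless of what is allocated. With a shift of 12 (4k blocks) both values are
1 and the shortcut does not fire, which is consistent with the warning only
showing up on the 1k block size file system.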

One solution is to remove the assert-warning. There's a little bit of wasteful
work being done to determine if the blocks are already allocated, but it doesn't
break anything.

Another way is to amend the patch above to consider the case where blocks beyond
the end of the file are allocated, and if so, return alloc_required = 0.
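
The difference between the two behaviours can be modelled in user space. This
is a rough sketch under stated assumptions, not a proposed patch: the array
stands in for the on-disk block map, lookup_dblock() is a hypothetical
substitute for gfs2_extent_map(), and the loop advances one block at a time
rather than by extlen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for gfs2_extent_map(): map[i] holds the disk block
 * backing logical block i, or 0 when nothing is allocated there. */
static uint64_t lookup_dblock(const uint64_t *map, size_t nblocks,
			      uint64_t lblock)
{
	return lblock < nblocks ? map[lblock] : 0;
}

/* Always consult the block map, even for blocks past end of file. */
static int alloc_required_scan(const uint64_t *map, size_t nblocks,
			       uint64_t lblock, uint64_t lblock_stop)
{
	assert(lblock <= lblock_stop);
	for (; lblock < lblock_stop; lblock++) {
		if (!lookup_dblock(map, nblocks, lblock))
			return 1; /* found a hole: allocation is needed */
	}
	return 0;
}

/* Behaviour after the patch for bug 253990: a write reaching past
 * end_of_file is assumed to need allocation without looking at the map. */
static int alloc_required_shortcut(const uint64_t *map, size_t nblocks,
				   uint64_t lblock, uint64_t lblock_stop,
				   uint64_t end_of_file)
{
	if (lblock_stop > end_of_file)
		return 1;
	return alloc_required_scan(map, nblocks, lblock, lblock_stop);
}
```

With four allocated blocks and end_of_file = 2, the shortcut variant reports
that allocation is required for blocks 0 to 3 while the scanning variant
reports that it is not. That disagreement is the condition under which
alloc_page_backing() finds nothing to allocate and the assertion warns.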

Steve/Bob, your thoughts?

Comment 7 Steve Whitehouse 2008-02-26 23:01:20 UTC
I guess the question is why those blocks are beyond the end of the file and
apparently already allocated? I wonder if it's a result of truncate not
truncating to the correct boundary perhaps.

Provided we are sure that the fact that the blocks already exist is harmless,
then I'm happy just to comment out the warning.


Comment 8 Abhijith Das 2008-02-29 21:42:11 UTC
Created attachment 296422 [details]
program to recreate bug

Paths, filenames and numbers are hard-coded and there's no error checking
whatsoever.
Just make sure you mkfs.gfs2 with blocksize 1024

Comment 9 Abhijith Das 2008-03-08 22:31:00 UTC
Posted patch to comment out the assert warning to rhkernel-list
http://post-office.corp.redhat.com/archives/rhkernel-list/2008-March/msg00241.html

Comment 10 Don Zickus 2008-03-12 19:41:38 UTC
in kernel-$NEW_VER
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 11 Don Zickus 2008-03-12 20:00:17 UTC
in kernel-2.6.18-85.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 14 errata-xmlrpc 2008-05-21 15:09:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html