Bug 465151 - GFS: madvise system call causes assertion
Summary: GFS: madvise system call causes assertion
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs-kmod
Version: 5.3
Hardware: All
OS: Linux
urgent
urgent
Target Milestone: rc
: ---
Assignee: Chris Feist
QA Contact: Cluster QE
URL:
Whiteboard:
: 464837 (view as bug list)
Depends On:
Blocks: 465325 465327
TreeView+ depends on / blocked
 
Reported: 2008-10-01 20:09 UTC by Abhijith Das
Modified: 2018-10-20 03:02 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 21:18:04 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
RHEL5.3 version of the proposed patch (2.97 KB, patch)
2008-10-01 22:10 UTC, Abhijith Das
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0132 0 normal SHIPPED_LIVE gfs-kmod bug-fix update 2009-01-20 16:04:57 UTC

Description Abhijith Das 2008-10-01 20:09:35 UTC
Description of problem:

Since the madvise system call was enabled by the patch to bug 429343, it's possible for a inode glock holder to never get dequeued through gfs_readpage. This causes an assertion (bug 464837)

GFS: fsid=cl102a:gfs1.1: warning: assertion "(gh->gh_flags & LM_FLAG_ANY) || (tmp_gh->gh_flags & LM_FLAG_ANY)" failed
GFS: fsid=cl102a:gfs1.1: function = add_to_queue
GFS: fsid=cl102a:gfs1.1: file = /builddir/build/BUILD/gfs-kmod-0.1.23/_kmod_build_/src/gfs/glock.c, line = 1418
GFS: fsid=cl102a:gfs1.1: time = 1222739610

I can't think of a way to make madvise work, and reverting the patch mentioned above and returning ENOSYS to madvise is the currently proposed solution to this problem.

I'll post a patch as soon as I verify that everything works as expected.

Comment 1 Chris Feist 2008-10-01 20:44:34 UTC
Setting acks.

Comment 2 Abhijith Das 2008-10-01 22:10:50 UTC
Created attachment 319171 [details]
RHEL5.3 version of the proposed patch

This patch reverts the patch to bz 429343. Don't log any warnings/errors and simply return ENOSYS when you arrive at gfs_readpage without the inode glock held. (madvise syscall case)

Comment 3 Kiersten (Kerri) Anderson 2008-10-02 13:56:31 UTC
Pushing out a 5.2.z release, so need to address in 5.3 as well.

Comment 4 Kiersten (Kerri) Anderson 2008-10-02 14:15:16 UTC
Guess this should be a blocker instead of an exception due to 5.2.z activity.

Comment 5 Chris Feist 2008-10-02 18:54:25 UTC
This has been committed and is being marked modified so we can get a zstream bug.

Comment 8 Abhijith Das 2008-10-07 15:27:52 UTC
*** Bug 464837 has been marked as a duplicate of this bug. ***

Comment 9 Chris Evich 2008-10-07 17:12:05 UTC
I'm probably completely misunderstanding this, but won't reversing the entire patch regress the 429343 fix?  


Isn't this:

-		return AOP_TRUNCATED_PAGE;
+		return -ENOSYS;

all that we need to do?

Comment 10 Abhijith Das 2008-10-07 17:41:20 UTC
Chris,

Take a look at https://bugzilla.redhat.com/show_bug.cgi?id=429343#c5
The "easier solution" is the one we've done here.

In the patch in bz 429343, doing

-  return AOP_TRUNCATED_PAGE;
+  return -ENOSYS;

is not correct. We shouldn't attempt to grab the inode glock if it's not already held in gfs_readpage. Notice that we don't free (dequeue) the inode glock until gfs_readpage is called again. Returning AOP_TRUNCATED_PAGE when gfs_readpage is called through the madvise code, will not cause gfs_readpage to be called again and this inode glock will never be freed. Replacing AOP_TRUNCATED_PAGE with -ENOSYS will definitely not solve this and we'll end up with the same unfreed glock.

Hope this clears things for you.
Also, it's worth noting that the original bug 429343 was the assert in gfs_readpage.

if (gfs_assert_warn(sdp, gfs_glock_is_locked_by_me(ip->i_gl))) {
	unlock_page(page);
	return -ENOSYS;
}

After the revert+change with the patch in comment #2, the only change we'll have is
- if (gfs_assert_warn(sdp, gfs_glock_is_locked_by_me(ip->i_gl))) {
+ if (!gfs_glock_is_locked_by_me(ip->i_gl)) {

Comment 13 errata-xmlrpc 2009-01-20 21:18:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0132.html


Note You need to log in before you can comment on or make changes to this bug.