Description of problem: Since the madvise system call was enabled by the patch to bug 429343, it's possible for a inode glock holder to never get dequeued through gfs_readpage. This causes an assertion (bug 464837) GFS: fsid=cl102a:gfs1.1: warning: assertion "(gh->gh_flags & LM_FLAG_ANY) || (tmp_gh->gh_flags & LM_FLAG_ANY)" failed GFS: fsid=cl102a:gfs1.1: function = add_to_queue GFS: fsid=cl102a:gfs1.1: file = /builddir/build/BUILD/gfs-kmod-0.1.23/_kmod_build_/src/gfs/glock.c, line = 1418 GFS: fsid=cl102a:gfs1.1: time = 1222739610 I can't think of a way to make madvise work, and reverting the patch mentioned above and returning ENOSYS to madvise is the currently proposed solution to this problem. I'll post a patch as soon as I verify that everything works as expected.
Setting acks.
Created attachment 319171 [details] RHEL5.3 version of the proposed patch This patch reverts the patch to bz 429343. Don't log any warnings/errors and simply return ENOSYS when you arrive at gfs_readpage without the inode glock held. (madvise syscall case)
Pushing out a 5.2.z release, so need to address in 5.3 as well.
Guess this should be a blocker instead of an exception due to 5.2.z activity.
This has been committed and is being marked modified so we can get a zstream bug.
*** Bug 464837 has been marked as a duplicate of this bug. ***
I'm probably completely misunderstanding this, but won't reversing the entire patch regress the 429343 fix? Isn't this: - return AOP_TRUNCATED_PAGE; + return -ENOSYS; all that we need to do?
Chris, Take a look at https://bugzilla.redhat.com/show_bug.cgi?id=429343#c5 The "easier solution" is the one we've done here. In the patch in bz 429343, doing - return AOP_TRUNCATED_PAGE; + return -ENOSYS; is not correct. We shouldn't attempt to grab the inode glock if it's not already held in gfs_readpage. Notice that we don't free (dequeue) the inode glock until gfs_readpage is called again. Returning AOP_TRUNCATED_PAGE when gfs_readpage is called through the madvise code, will not cause gfs_readpage to be called again and this inode glock will never be freed. Replacing AOP_TRUNCATED_PAGE with -ENOSYS will definitely not solve this and we'll end up with the same unfreed glock. Hope this clears things for you. Also, it's worth noting that the original bug 429343 was the assert in gfs_readpage. if (gfs_assert_warn(sdp, gfs_glock_is_locked_by_me(ip->i_gl))) { unlock_page(page); return -ENOSYS; } After the revert+change with the patch in comment #2, the only change we'll have is - if (gfs_assert_warn(sdp, gfs_glock_is_locked_by_me(ip->i_gl))) { + if (!gfs_glock_is_locked_by_me(ip->i_gl)) {
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-0132.html