Red Hat Bugzilla – Bug 446128
gfs_controld: plock result write err 0 errno X
Last modified: 2010-10-22 20:57:05 EDT
Description of problem:
When I try a plock operation on an NFS-mounted GFS file system, I get the
following message in /var/log/messages. The plock operation itself does work.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
A. On Server
1. mkfs -t gfs -p lock_dlm -t ... /dev/foo
2. mount -t gfs /dev/foo /mnt/foo
3. exportfs -o rw client:/mnt/foo
4. tail -f /var/log/messages
B. On client
1. mount server:/mnt/foo /mnt/foo
2. cd /mnt/foo
3. xiogen -i 1 -F 10k:testfile | xdoio -k
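The xiogen/xdoio pipeline above exercises POSIX record locks (plocks) on the NFS mount. A minimal sketch of the kind of lock request it generates is below; the path and byte range are illustrative. Over NFS the F_SETLKW travels through lockd to the server, where lock_dlm passes it down to gfs_controld.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Take and release an exclusive POSIX lock (plock) on the first `len`
   bytes of `path`. Path and length are illustrative, not from xiogen. */
int take_plock(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct flock fl = {
        .l_type = F_WRLCK,     /* exclusive (write) lock */
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = len,
    };
    if (fcntl(fd, F_SETLKW, &fl) < 0) {   /* block until granted */
        perror("fcntl F_SETLKW");
        close(fd);
        return -1;
    }

    fl.l_type = F_UNLCK;                  /* release the lock */
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return 0;
}
```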
May 12 15:37:59 newport gfs_controld: plock result write err 0 errno 2
May 12 15:38:16 newport gfs_controld: plock result write err 0 errno 2
May 12 16:13:29 tank-04 gfs_controld: plock result write err 0 errno 9
May 12 16:14:07 tank-04 last message repeated 3 times
May 12 16:21:31 tank-04 gfs_controld: plock result write err 0 errno 9
No "error" messages
I'm assuming there are no errors reported for plocks requested on gfs directly?
I'm pretty sure this has to do with the way the source node of a request
is identified and the fact that the node/process identifiers change for plocks
arriving through nfs.
Correct, I ran the tests w/o NFS in the mix and I did not see any extra messages.
The error is harmless apart from the annoying messages. The kernel
is returning the wrong value from write(2) on the plock device
(0 instead of the number of bytes written). Until the kernel is
fixed, this fix just removes the error message from gfs_controld.
commit on RHEL5 branch 685498d154acfff23e4af7bfe874a7b0ed2eb9c5
commit on STABLE2 branch a6b6a30358fd5e247a37e2fe493ef6a683174b66
I face the same problem. We have
RH 5.1, using gfs2
I get the error message roughly every 15 minutes.
I am not sure if this problem is related to the error message I get, but when
usage increases, the clients that mount this filesystem via NFS
just hang. I also see gfs_controld using 100% CPU on the node.
I have the same problem serving nfs from gfs(1) with 5.2 versions:
while the "plock result write err 0 errno 2" message is most common, I also see
these messages as well:
gfs_controld: plock result write err -1 errno 2
gfs_controld: plock result write err 0 errno 9
gfs_controld: plock result write err 0 errno 11
kernel: lock_dlm: gdlm_plock: vfs lock error file ffff81011e2748c0 fl \
kernel: lockd: grant for unknown block
kernel: gfs2 lock granted after lock request failed; dangling lock!
(In reply to comment #5)
> I have the same problem serving nfs from gfs(1) with 5.2 versions:
> while the "plock result write err 0 errno 2" message is most common, I also see
> these messages as well:
> gfs_controld: plock result write err -1 errno 2
> gfs_controld: plock result write err 0 errno 9
> gfs_controld: plock result write err 0 errno 11
> kernel: lock_dlm: gdlm_plock: vfs lock error file ffff81011e2748c0 fl \
> kernel: lockd: grant for unknown block
> kernel: gfs2 lock granted after lock request failed; dangling lock!
I have the same problem as Andrew, also with 5.2. Adding
<gfs_controld plock_rate_limit="0" plock_ownership="1"/>
to /etc/cluster/cluster.conf seemed to help the possibly-unrelated problem
of nfs clients appearing to hang when accessing the nfs-mounted gfs filesystem,
but access is still much slower than with 5.1.
Does this mean if the host exports the GFS file system via NFS, the NFS client will receive a lock request failure, even though the lock is eventually granted?
I'm seeing very bad Firefox 3.0 performance on NFS clients when ~/.mozilla is on a GFS file system, because of sqlite failures on files like places.sqlite. I am wondering if this is the cause.
plocks work fine, this bz is just about a bad log message that's been removed
(see comment 3)
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.