Bug 446128 - gfs_controld: plock result write err 0 errno X
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.2
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: David Teigland
QA Contact: GFS Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-05-12 21:22 UTC by Nate Straz
Modified: 2018-10-20 01:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 21:53:08 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0189 0 normal SHIPPED_LIVE cman bug-fix and enhancement update 2009-01-20 16:05:55 UTC

Description Nate Straz 2008-05-12 21:22:53 UTC
Description of problem:

When I try a plock operation on an NFS mounted GFS file system I get the
following message in /var/log/messages.  The plock operation does work.

Version-Release number of selected component (if applicable):
cman-2.0.84-2.el5
kernel-2.6.18-92.el5
kmod-gfs-0.1.23-5.el5


How reproducible:
100%

Steps to Reproduce:
A. On Server
 1. mkfs -t gfs -p lock_dlm -t ... /dev/foo
 2. mount -t gfs /dev/foo /mnt/foo
 3. exportfs -o rw client:/mnt/foo
 4. tail -f /var/log/messages
B. On client
 1. mount server:/mnt/foo /mnt/foo
 2. cd /mnt/foo
 3. xiogen -i 1 -F 10k:testfile | xdoio -k
  
Actual results:
May 12 15:37:59 newport gfs_controld[4835]: plock result write err 0 errno 2
May 12 15:38:16 newport gfs_controld[4835]: plock result write err 0 errno 2

May 12 16:13:29 tank-04 gfs_controld[26444]: plock result write err 0 errno 9
May 12 16:14:07 tank-04 last message repeated 3 times
May 12 16:21:31 tank-04 gfs_controld[26444]: plock result write err 0 errno 9


Expected results:
No "error" messages

Additional info:

Comment 1 David Teigland 2008-05-12 21:53:36 UTC
I'm assuming there are no errors reported for plocks requested on gfs directly?
I'm pretty sure this has to do with the way the source node of a request
is identified and the fact that the node/process identifiers change for plocks
arriving through nfs.


Comment 2 Nate Straz 2008-05-12 22:12:31 UTC
Correct, I ran the tests w/o NFS in the mix and I did not see any extra messages.

Comment 3 David Teigland 2008-05-13 19:14:26 UTC
The error is harmless apart from the annoying messages.  The kernel
is returning the wrong value from write(2) on the plock device
(0 instead of the number of bytes written).  Until the kernel is
fixed, this fix just removes the error message from gfs_controld.

commit on RHEL5 branch 685498d154acfff23e4af7bfe874a7b0ed2eb9c5
commit on STABLE2 branch a6b6a30358fd5e247a37e2fe493ef6a683174b66
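
The reasoning in this comment can be illustrated with a small sketch. It is not the gfs_controld source; `report_write` and the 24-byte size are hypothetical names and values chosen for illustration. It models a caller that compares the value returned by write(2) with the number of bytes it tried to write. When the call returns 0 rather than -1, errno was not set by that call, so the errno value printed next to "err 0" is most likely left over from some earlier, unrelated call.

```python
import os


def report_write(rv, expected, err):
    """Return a log line for a suspicious write result, or None if it looks fine.

    rv       -- value returned by the write call (-1 models a C-style failure)
    expected -- number of bytes the caller tried to write
    err      -- errno value observed right after the call
    """
    if rv == expected:
        # Full write: nothing to report.
        return None
    if rv < 0:
        # The call failed, so errno describes this failure.
        return "write failed: %s" % os.strerror(err)
    # 0 <= rv < expected: the call did not fail, so errno was not set by it.
    return "short write: %d of %d bytes (errno %d is stale)" % (rv, expected, err)


if __name__ == "__main__":
    # A write that reports 0 bytes, as described above (24 is a made-up size).
    print(report_write(0, 24, 2))
    # A genuine failure with errno 9.
    print(report_write(-1, 24, 9))
```

Under these assumptions, a device that reports 0 after accepting the data would trigger the "short write" branch on every call even though nothing went wrong, which matches the harmless but noisy messages described here.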


Comment 4 Ozgur Akan 2008-07-07 17:30:56 UTC
I face the same problem. We have
RH 5.1, using gfs2
kernel 2.6.18-92.1.1.el5 
kmod-gfs-0.1.19-7.el5_1.3
cman-2.0.84-2.el5

I get the error message about every 15 minutes.

I am not sure whether this problem is related to the error message, but when
usage increases, the nfs clients of this exported filesystem just hang. I also
see gfs_controld using 100% cpu on the node.

Comment 5 Andrew Neuschwander 2008-07-11 17:06:07 UTC
I have the same problem serving nfs from gfs(1) with 5.2 versions:

cman-2.0.84-2.el5
kernel-2.6.18-92.1.6.el5
kmod-gfs-0.1.23-5.el5

While the "plock result write err 0 errno 2" message is most common, I also see
these messages:

gfs_controld[5348]: plock result write err -1 errno 2
gfs_controld[5348]: plock result write err 0 errno 9
gfs_controld[5237]: plock result write err 0 errno 11
kernel: lock_dlm: gdlm_plock: vfs lock error file ffff81011e2748c0 fl \
ffff8100c2aa6ce0
kernel: lockd: grant for unknown block
kernel: gfs2 lock granted after lock request failed; dangling lock!
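
For anyone sorting through messages like the gfs_controld lines above, the following is a minimal sketch of a decoder. The helper name `decode_plock_line` and the regular expression are assumptions based only on the message text quoted in this report, not on the gfs_controld source. It extracts the write return value and maps the errno number to its symbolic name on the machine where the script runs.

```python
import errno
import re

# Pattern derived from the log text quoted in this report (an assumption,
# not taken from the daemon's source).
PLOCK_ERR = re.compile(r"plock result write err (-?\d+) errno (\d+)")


def decode_plock_line(line):
    """Return (write_return_value, errno_name) or None if the line does not match."""
    match = PLOCK_ERR.search(line)
    if match is None:
        return None
    rv = int(match.group(1))
    err = int(match.group(2))
    # errno.errorcode maps numbers to names for the local platform.
    return rv, errno.errorcode.get(err, "unknown")


if __name__ == "__main__":
    for sample in (
        "gfs_controld[5348]: plock result write err -1 errno 2",
        "gfs_controld[5348]: plock result write err 0 errno 9",
    ):
        print(decode_plock_line(sample))
```

The errno name is only informative when the return value is -1. For the "err 0" lines the write call itself did not fail, so the errno number was not set by it.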

Comment 6 Frank Delahoyde 2008-08-04 16:18:23 UTC
(In reply to comment #5)
> I have the same problem serving nfs from gfs(1) with 5.2 versions:
> 
> cman-2.0.84-2.el5
> kernel-2.6.18-92.1.6.el5
> kmod-gfs-0.1.23-5.el5
> 
> While the "plock result write err 0 errno 2" message is most common, I also see
> these messages:
> 
> gfs_controld[5348]: plock result write err -1 errno 2
> gfs_controld[5348]: plock result write err 0 errno 9
> gfs_controld[5237]: plock result write err 0 errno 11
> kernel: lock_dlm: gdlm_plock: vfs lock error file ffff81011e2748c0 fl \
> ffff8100c2aa6ce0
> kernel: lockd: grant for unknown block
> kernel: gfs2 lock granted after lock request failed; dangling lock!

I have the same problem as Andrew, also with 5.2. Adding
<gfs_controld plock_rate_limit="0" plock_ownership="1"/>
to /etc/cluster/cluster.conf seemed to help the possibly-unrelated problem
of nfs clients appearing to hang when accessing the nfs-mounted gfs filesystem,
but access is still much slower than with 5.1.
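
To check which plock settings a given cluster.conf carries, something like the sketch below may help. It assumes the `<gfs_controld>` element sits directly under the top-level `<cluster>` element. The cluster name, `config_version` value, and the helper name `plock_settings` are made up for illustration, and the sample is not a complete configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed cluster.conf; only the gfs_controld line
# comes from this comment.
SAMPLE = """<cluster name="example" config_version="1">
  <gfs_controld plock_rate_limit="0" plock_ownership="1"/>
</cluster>"""


def plock_settings(conf_text):
    """Return the plock_* attributes of <gfs_controld>, or {} if it is absent."""
    root = ET.fromstring(conf_text)
    node = root.find("gfs_controld")  # direct child of <cluster> (assumed placement)
    if node is None:
        return {}
    return {key: value for key, value in node.attrib.items() if key.startswith("plock_")}


if __name__ == "__main__":
    print(plock_settings(SAMPLE))
```

An empty result means the file does not set these attributes, in which case the daemon's defaults apply.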

Comment 7 Steven Lee 2008-08-27 17:39:46 UTC
Does this mean if the host exports the GFS file system via NFS, the NFS client will receive a lock request failure, even though the lock is eventually granted?

I'm seeing very bad Firefox 3.0 performance on NFS clients when ~/.mozilla is on a GFS file system, because of sqlite failures on files like places.sqlite. I am wondering if this is the cause.

Comment 8 David Teigland 2008-08-27 17:50:16 UTC
plocks work fine; this bz is just about a bad log message that's been removed
(see comment 3).

Comment 11 errata-xmlrpc 2009-01-20 21:53:08 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0189.html

