Description of problem:
HP-UX NFS clients fail to create a new file on a CentOS 5 NFS server updated
to kernel 2.6.18-53.1.6.el5 when GFS is used as the backing filesystem.
Version-Release number of selected component (if applicable):
Using any HP-UX client.
I tested using both 11.00 and 11.23
Steps to Reproduce:
1. Create a GFS filesystem on the server
2. NFS export the filesystem
3. Mount on any hp-ux client
4. hp-ux$ cp anyfile /to/nfs/fs/on/rhas5
hp-ux$ cp: cannot create anyfile: Permission denied
HP-UX clients first create the new file using the NFS procedure CREATE (using
UNCHECKED and mode=0) and the server returns NFS3ERR_ACCES. I traced the -EACCES
error to the generic_permission kernel function. Apparently ext3 and xfs
filesystems do not use generic_permission but gfs does and it returns -EACCES to
I have created a simple patch to check for the -EACCES error and allow access
if the FSUID=Inode-UID. This resolves the problem but is probably not the best
way to fix the bug. I will attach the patch for reference. Hopefully those more
knowledgeable than I can determine the correct fix needed to resolve the root
cause of this bug.
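To make the failure mode concrete, here is a minimal Python sketch of the owner/group/other mode-bit test that a check like generic_permission performs. It is a simplified model, not the kernel source: ACLs and most capability rules are left out, and the names `Inode`, `Cred`, `generic_permission_model`, and `owner_override` are hypothetical. `owner_override` only illustrates the idea described above (allow access when FSUID equals the inode UID); it is not the attached patch.

```python
import errno
from dataclasses import dataclass

MAY_EXEC, MAY_WRITE, MAY_READ = 1, 2, 4


@dataclass
class Inode:
    mode: int  # permission bits only, e.g. 0o644
    uid: int
    gid: int


@dataclass
class Cred:
    fsuid: int
    groups: frozenset = frozenset()
    dac_override: bool = False  # stands in for CAP_DAC_OVERRIDE (root)


def generic_permission_model(inode, cred, mask):
    """Return 0 if access is allowed, -EACCES otherwise (simplified)."""
    mode = inode.mode
    if cred.fsuid == inode.uid:
        mode >>= 6          # use the owner bits
    elif inode.gid in cred.groups:
        mode >>= 3          # use the group bits
    if (mode & mask & 0o7) == mask:
        return 0
    if cred.dac_override:   # assumption: privileged callers bypass mode bits
        return 0
    return -errno.EACCES


def owner_override(inode, cred, mask):
    """Hypothetical workaround: on -EACCES, let the inode owner through."""
    err = generic_permission_model(inode, cred, mask)
    if err == -errno.EACCES and cred.fsuid == inode.uid:
        return 0
    return err
```

In this model, a write check on a file with mode 0 fails for its unprivileged owner and succeeds for a caller with `dac_override` set, while `owner_override` lets the owner through in both cases.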
Created attachment 293760
Patch to nfsd_setattr to ignore -EACCES error caused by the GFS filesystem
I don't have access to an hp-ux box, so I have no good way to debug
this. However, I do know that the extended attribute code in GFS was
recently changed to remove the permission() calls for a different
reason. This is documented as bugzilla bug #323111, crosswritten from
GFS2 to GFS1. Since this is an internal bugzilla record, I don't know
if you will have permission to view it, and I don't have authority to
change that. However, the fix went into RHEL in kmod-gfs-0.1.21-1, and
CentOS will probably be the same. Can you upgrade to that version or
higher and see if this is still a problem? Thanks.
How would I get this kmod-gfs-0.1.21-1?
I just did a yum update and I have the latest, kmod-gfs-0.1.16-5
I will test against this gfs module if I can get it...
Can you post the location of your yum repository? In other words, the
contents of /etc/yum.repos.d/RHEL5.repo or the CentOS equivalent.
If there isn't a newer version, you can either wait for a new update
(I don't know how CentOS works or how often they do updates)
or alternately, since you've done some experimentation, I suppose you
could compile the latest gfs from source code by doing something like
First, fetch the source code from CVS. Something like this:
cvs -d :pserver:sources.redhat.com:/cvs/cluster checkout -r RHEL5 cluster
Then compile by doing something like:
I don't recommend doing a make install, but you can manually load the
gfs module by doing insmod gfs.ko from that directory.
Then mount the file system and try to recreate the problem.
Of course, you'll first have to boot to the kernel that doesn't have
the circumvention patch you attached to see if it's broken or fixed.
I should have mentioned this in my update: I have built a Red Hat AS 5 system
just to test and work on this bug. So I am using RHAS5; the release file lists:
Red Hat Enterprise Linux Server release 5.1 (Tikanga)
I installed 5.0 and then did a "yum update" to get current. Running the
update brought me up to kernel-2.6.18-53.1.6.el5. I have not changed any of
the yum config files. The only thing I see on my system under /etc/yum.repos.d
is the single file rhel-debuginfo.repo, so I assume that I am using the default
Tested a current build from the CVS repo, problem still exists.
I used the cvs command given previously to pull the current cluster source.
Compiled it and reloaded the gfs module, dmesg showed;
GFS <CVS> (built Feb 6 2008 13:19:52) installed
So it would appear the new module actually loaded OK.
I mounted the GFS test filesystem, started NFSD and tested using the HP-UX NFS
Same results, the latest updates to the GFS module do not appear to address this
Thanks for checking that out. I'm trying to find/borrow an hp-ux client
machine within Red Hat so I can debug this. I've tried several channels
already and come up empty handed, but I'm not out of options yet.
In the meantime, it might help me if you collect an Ethereal / Wireshark
trace of this problem so I can analyze the requests that hp-ux is actually
sending to the NFS server. The smaller the trace, the easier it will be
for me to read.
I already have one, I should have attached it from the beginning.
Attaching the tcpdump file now...
Also, I added some simple printk debugging to the NFSD code in the kernel and
found it to be following this path (much detail not shown);
vfs_create - Note: the create returns OK
generic_permission - Note: returns a -EACCES error!
In case that might help...
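The pattern in that call path (the create succeeds, a later permission check fails) has a local POSIX analogue that may help when reasoning about it. The sketch below is an illustration only and does not exercise NFS or GFS; the function name `reopen_mode0_file` is hypothetical. It creates a file with mode 0, writes through the descriptor returned by the create, then tries to open the same file again for writing. A local process can keep using its descriptor, whereas an NFSv3 server presumably has to check permissions again on each request.

```python
import errno
import os
import tempfile


def reopen_mode0_file():
    """Create a file with mode 0, write through the creating descriptor,
    then try to reopen it for writing. Returns 0 or an errno value."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "anyfile")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL, 0)
        try:
            os.write(fd, b"data")   # allowed: checked once, at open time
        finally:
            os.close(fd)
        try:
            os.close(os.open(path, os.O_WRONLY))
        except OSError as exc:
            return exc.errno        # EACCES for an unprivileged owner
        return 0                    # privileged callers bypass the mode bits
```

Run as a regular user, the second open is expected to fail with EACCES; run as root, it is expected to succeed.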
Created attachment 294161
HP-UX NFS Client tcpdump trace
Here's the tcpdump I had from my testing against CentOS5. It should be the same
as my RHAS5 as it appears to have the same defect.
I finally got access to a real hp-ux machine and tried to recreate the
problem. It did not fail. The client was 11.23 on ia64:
# uname -a
HP-UX xxxxx B.11.23 U ia64 1756071376 unlimited-user license
For my server I used a RHEL5.2 prototype / pre-release system, i686.
I copied three different files over NFS of various sizes: 1K,
150K and 1.5MB. I verified on the server that the files were
copied successfully with no error messages and the contents were
I was running as root on both client and server, with no firewall
between them and no root squash. My /etc/exports looked like this:
[root@kool gfs]# cat /etc/exports
I haven't tried it on a true 5.1 server machine.
Dan, are you sure this wasn't a problem with selinux on the server
or various firewalls (iptables, etc.) interfering with the copy?
I am pretty sure...
I normally configure my Linux NFS servers with iptables off and selinux disabled.
And, I was able to trace the NFS call into the nfsd layer in the kernel and down to the generic_permission()
kernel function. So I know the NFS CREATE client RPC is not being blocked.
If I recall, a simple cp of a file that does not yet exist on the NFS server using gfs filesystems is all that is
required to get the failure. I think a simple "mkdir" also fails.
I carefully looked at your test setup and I missed a detail on the first read...
It will NOT fail if you are running as root on the client. I knew that but I did
not read your posting carefully enough. Try it again using a regular account on
the HP-UX client.
By using a non-root user, I've recreated the problem using the hp-ux
machine. I've also compared what's happening against a Fedora 8 nfs
client with suggestions from Steve Dickson. The difference between
the two calls is this:
The Fedora 8 nfs client specifies a create mode of 2 (exclusive)
whereas the hp-ux client specifies a create mode of 0 (unchecked).
Now I'll backtrack how create mode 0 is handled by nfs and gfs as
opposed to ext3.
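For reference, the create mode numbers above correspond to the `createmode3` values in the NFSv3 specification (RFC 1813). The Python sketch below just names them; the helper `create_carries_attributes` is a hypothetical name. It reflects the fact that UNCHECKED and GUARDED requests carry initial attributes (such as mode=0), while EXCLUSIVE requests carry a verifier instead. Whether that difference is what sends the hp-ux request down the failing path is an assumption here, not something established in this report.

```python
from enum import IntEnum


class CreateMode3(IntEnum):
    """createmode3 values from the NFSv3 specification (RFC 1813)."""
    UNCHECKED = 0
    GUARDED = 1
    EXCLUSIVE = 2


def create_carries_attributes(mode):
    """UNCHECKED and GUARDED carry initial attributes (sattr3);
    EXCLUSIVE carries a verifier instead."""
    return CreateMode3(mode) is not CreateMode3.EXCLUSIVE
```

For example, `CreateMode3(0).name` gives the name of the hp-ux create mode and `CreateMode3(2).name` the Fedora 8 one.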
I tested the same set of commands using:
(1) gfs, (2) gfs2, (3) ext3, (4) xfs
All of them behaved the same way with the same client/server pair.
If the client is hp-ux and the user was not root, they gave permission
denied. If the user was root, they worked properly. If the client was
f8, they worked properly. So this looks to me like an nfs problem, not
a gfs problem.
I ran my tests again on my RHEL 5 server using both gfs and ext3. On my server
the hp-ux client fails when the backing store is;
gfs - fails as described
ext3 - no failure
Maybe your test system is using an ext3 update where they also started calling
generic_permission? Just a guess on my part...
Anyhow, it may really be a nfsd problem (that's where my simple patch was made)
that just surfaced with GFS first, and then later as other filesystems were
I tried out an nfsd patch that had been posted for bug #432690,
but it didn't solve the problem. I also verified that it still fails
on ext3 with my NFS server. If what Dan says is true, that ext3 works
for him, then we may be dealing with two problems: one that's keeping
nfsd from working on my server, and one that's making gfs fail on Dan's.
Something you may want to consider...
In my limited research on this problem I discovered that this
generic_permission() was introduced starting in kernel 2.6.10 and that it was
recommended that all filesystems start using this function instead of older
methods (in the filesystem code I believe). I read that XFS was already changed,
but when I tested with RHEL5 it appeared that change was not yet made for XFS or
ext3. Just GFS was calling this generic_permission and when it returns with
-EACCES that causes the failure in the NFSD code to be returned.
So, you may want to verify if you are running more recent kernels than I am that
this generic_permission() has not been introduced into ext3, XFS, etc...
Just a thought, if it's true then RHEL5 will be broken for all HP-UX NFS clients
for more filesystems than just GFS as is my current case at 2.6.18-53.1.6.el5
Any update on this one?
I know Bob looked at it from a GFS viewpoint, and when it was found to also be
in other filesystems the problem was tagged to be a NFS kernel problem. Since
then no activity and I see Bob is still the "Assigned To" person. I was
expecting it to be reassigned to another kernel type person...
I think we're dealing with two problems here, but my progress is being
held up by the problem that keeps nfsd from working in this case on
ext3. Therefore, I'm reassigning to Steve D.
Still trying to locate an HP-UX client...
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
Changing component from gfs2-kmod to kernel so that it stops showing
up on Kevin Anderson's list.
Updating PM score.
Has there been any update to this issue? I'm seeing this exact issue:
Client: HP-UX wren B.11.11 U 9000/889
NFS Server: Red Hat Enterprise Linux Server release 5.3 (Tikanga) - 2.6.18-128.1.6.el5 x86_64
GFS is the underlying filesystem for the NFS server.
*** This bug has been marked as a duplicate of bug 605720 ***