Bug 1735480 - git clone fails on gluster volumes exported via nfs-ganesha
Summary: git clone fails on gluster volumes exported via nfs-ganesha
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: nfs-ganesha
Classification: Retired
Component: FSAL_GLUSTER
Version: 2.7
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Kaleb KEITHLEY
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1543996 1751210 1753569
 
Reported: 2019-07-31 22:31 UTC by Chad Feller
Modified: 2020-06-24 11:55 UTC

Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Clones: 1753569
Environment:
Last Closed: 2020-06-24 11:55:11 UTC



Description Chad Feller 2019-07-31 22:31:19 UTC
Description of problem:

'git clone' fails on Gluster volumes exported via nfs-ganesha (mounted over NFS), but succeeds on the same volume exported natively (mounted via glusterfs).


Version-Release number of selected component (if applicable):
Server: 
CentOS 7

NFS Ganesha:
nfs-ganesha-2.7.6-1.el7.x86_64
nfs-ganesha-gluster-2.7.6-1.el7.x86_64

Gluster:
glusterfs-server-6.4-1.el7.x86_64
glusterfs-6.4-1.el7.x86_64

Client:
Debian 10

How reproducible:

Always

Steps to Reproduce:
1. mount -t nfs <gluster-server>:/gluster/vol /mnt/gluster
2. cd /mnt/gluster
3. git clone https://github.com/torvalds/linux.git

Actual results:
git clone https://github.com/torvalds/linux.git
Cloning into 'linux'...
remote: Enumerating objects: 1640, done.
remote: Counting objects: 100% (1640/1640), done.
remote: Compressing objects: 100% (881/881), done.
fatal: Unable to create temporary file '/mnt/gluster/linux/.git/objects/pack/tmp_pack_XXXXXX': Permission denied
fatal: index-pack failed


Expected results:
Successful git clone

Additional info:
This appears to be related to:
https://github.com/nfs-ganesha/nfs-ganesha/issues/262

which in turn references another bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1543996
which itself references:
https://bugzilla.redhat.com/show_bug.cgi?id=1405147

Not sure if this is a separate issue, a related one, or a regression.

Comment 1 Soumya Koduri 2019-09-13 11:03:02 UTC
An EACCES error was returned for the COMMIT operation: nfs-ganesha tries to open an fd as part of COMMIT, which the backend glusterfs server denies because the file perms are 0444.

A similar problem was recently reported upstream - https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/447012. The issue here is that

- nfs-ganesha/FSAL_GLUSTER switches to the user's credentials before performing any operation on the backend. [This is needed to be able to run nfs-ganesha as a non-root user.]
- To perform any stateless I/O (such as NFSv3 operations or NFSv4.x COMMIT operations), the ganesha server tries to open/use a global fd maintained for each file.

In this particular case,

- the NFSv4 client created a file with permissions 0444 and wrote some data through the same fd.
- When it then performs a COMMIT operation to flush the cached data, the ganesha server tries to open a new global fd, which the gluster server denies, as expected, because the file has 0444 perms.

The fix for this issue is not trivial. Possible options we are exploring are to either

a) dup the fd (when the file is OPENed with CREAT) and store it as the global fd, taking an extra ref so that it doesn't get flushed during CLOSE. This same fd, if used in COMMIT, can bypass the access checks.

(or)

b) Frank suggested that we maintain a list of all open states of the file and, for any stateless I/O, find a matching state (with the same client and creds) and use it to perform the I/O. This would also help enforce share reservations for stateless I/O.

Need to check if we can use approach (a) as an interim workaround until (b) gets done.

Comment 2 Soumya Koduri 2019-09-13 11:03:29 UTC
I have done a PoC to fix this particular case. Fixes are needed in multiple places:


Issue 1) Right now in FSAL_GLUSTER, we use "glfs_h_creat" to create the handle and then "glfs_h_open" to fetch the glfd. These two operations need to be combined into one atomic fop that creates the handle and also returns the glfd, so that file creation with 0444 perms can be handled.

Fix: Add a new API, "glfs_h_creat_glfd", to libgfapi for this.

Issue 2) Sometimes the NFS client seems to open a file twice without closing the first OPEN (the first time with OPEN_SHARE_ACCESS_BOTH and the second time with OPEN_SHARE_ACCESS_READ). In such cases NFS-Ganesha tries to reopen the file the second time, which may fail with EPERM.

Fix: If the first OPEN state/fd already grants the access needed by the second OPEN, avoid re-opening the file.

Issue 3) As mentioned in the comment above, since there is no state associated, the COMMIT operation tries to re-open and obtain the global fd, which fails with EPERM.

Approach taken: dup the glfd returned by the OPEN operation and store it as the global fd. The dup takes an extra ref, and this new glfd/global fd gets closed as part of LRU purge or file removal.

Comment 4 Matheus Morais 2019-10-13 04:30:52 UTC
Hi,

I'm using the following versions of NFS Ganesha/GlusterFS and am facing this bug:

nfs-ganesha-gluster-2.8.2-1.el7.x86_64
nfs-ganesha-2.8.2-1.el7.x86_64
glusterfs-server-6.5-1.el7.x86_64
glusterfs-6.5-1.el7.x86_64

Is there any workaround for this issue?

Are there plans to backport this fix to the 2.8 versions, or will it only be in the latest Ganesha release (3.0)?

Thanks,

Matheus Morais

Comment 5 Pasi Karkkainen 2019-10-13 19:25:03 UTC
It'd be nice to have the fixes backported to the nfs-ganesha 2.8.x branch as well, for the 2.8.3 release.

Comment 6 Matheus Morais 2019-10-14 14:03:02 UTC
Yes, that would be great.

I can help test whether those patches fix the problem on 2.8.

@Soumya, can you please provide a patch on latest 2.8 branch?

Thanks.

Comment 7 Matheus Morais 2019-10-15 15:21:23 UTC
I did a backport of Soumya's patches to GlusterFS 6.5 and nfs-ganesha 2.8 and can confirm that it solves the problem on those specific versions.

Comment 8 Nicolas Derive 2019-11-04 12:33:19 UTC
(In reply to Matheus Morais from comment #7)
> I did a backport of Soumya's patches to GlusterFS 6.5 and nfs-ganesha 2.8 and
> can confirm that it solves the problem on those specific versions.

Do you have RPMs or source somewhere for your backport?

Thanks.

Comment 9 Kaleb KEITHLEY 2020-06-24 11:55:11 UTC
If this is still an issue, please open an issue in the GitHub tracker at https://github.com/nfs-ganesha/nfs-ganesha/issues

