Bug 762204 (GLUSTER-472)

Summary: OpenOffice fails on GlusterFS $HOME due to fuse_loc_fill error
Product: [Community] GlusterFS
Reporter: Jeff Darcy <jdarcy>
Component: fuse
Assignee: Csaba Henk <csaba>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: high
Version: 3.0.0
CC: gluster-bugs, rabhat
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Attachments:
  proposed fix (flags: none)

Description Jeff Darcy 2009-12-15 15:15:08 UTC
A user on IRC reported that he'd get a bus fault when trying to run OpenOffice when $HOME was on a GlusterFS filesystem.  I was able to reproduce this behavior on my own setup, and observed that it had to do with the following sequence of operations on a temp file: open(O_TRUNC), unlink, ftruncate (to 4K), mmap (read+write), mmap (read+exec).  It turns out that the ftruncate was failing with ENOENT, so the file was never actually extended and thus the expected page was never valid.
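
The sequence described above can be approximated with a short script. This is a minimal sketch, not the test that was actually used for this report: the function name `reproduce`, the temp file name, the use of O_CREAT alongside O_TRUNC, and the choice of MAP_SHARED are all assumptions; only the order of operations comes from the description.

```python
import mmap
import os
import tempfile

PAGE = 4096  # the description extends the file to 4K


def reproduce(directory):
    """Run open(O_TRUNC), unlink, ftruncate, mmap(rw), mmap(rx) in `directory`.

    Returns (size after ftruncate, bytes read back through the read+exec
    mapping, or None if that mapping was refused). Raises OSError if
    ftruncate fails, which is the failure described in this report.
    """
    path = os.path.join(directory, "ooo-repro.tmp")  # hypothetical name
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.unlink(path)         # the file is now open but has no name
        os.ftruncate(fd, PAGE)  # the step reported to fail with ENOENT
        size = os.fstat(fd).st_size

        # First mapping: read+write. Python checks the length against the
        # file size up front; a C program that ignored the ftruncate error
        # would instead take SIGBUS on first access to the page.
        with mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as rw:
            rw[:4] = b"test"

        # Second mapping: read+exec. A noexec mount refuses this, which is
        # unrelated to the bug, so it is tolerated here.
        readback = None
        try:
            with mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ | mmap.PROT_EXEC) as rx:
                readback = rx[:4]
        except PermissionError:
            pass
        return size, readback
    finally:
        os.close(fd)
```

Calling `reproduce()` with a directory on the filesystem under test should return a size of 4096 on a healthy mount; on an affected mount the expectation, based on the description, is a FileNotFoundError from the ftruncate step.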

The reason that the ftruncate failed was that fuse_setattr saw a failure from fuse_loc_fill, which did fill in loc->inode but subsequently failed to fill in loc->path for the by-now-unlinked file.  My understanding of the code is that one or the other should be sufficient, with the inode preferred, so this should not cause an error.  I was able to verify that fixing fuse_setattr to work around this condition allowed OpenOffice to work, and didn't see any obvious problems in other quick testing, so I sent the user a patch to see if this improved his situation.

I haven't heard back yet, but I think this is a more fundamental problem with how fuse_loc_fill reports status and it affects many calls in fuse-bridge.c besides fuse_setattr.  I've generated a more comprehensive patch in which fuse_loc_fill reports success if it's able to fill in either the inode or the path, even if the other fails, and testing so far has revealed no new problems.  Is there any reason people know of why we couldn't adopt this approach generally?
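
The difference between the observed and the proposed status reporting can be modelled in a few lines. This is a toy model only, not GlusterFS code: the `Loc` class, both function names, and the 0/-1 return convention are hypothetical, and the "observed" rule is inferred from the description above (failure when the path cannot be filled even though the inode was).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Loc:
    """Hypothetical stand-in for a loc: either field may be unresolved."""
    inode: Optional[int] = None
    path: Optional[str] = None


def fill_status_observed(loc: Loc) -> int:
    # Behaviour as described in this report: a missing path is reported
    # as failure even when the inode was filled in.
    return 0 if loc.inode is not None and loc.path is not None else -1


def fill_status_proposed(loc: Loc) -> int:
    # Proposed behaviour: success if either the inode or the path is
    # available; failure only when both are missing.
    return 0 if loc.inode is not None or loc.path is not None else -1


# An open-but-unlinked file: the inode is known, the path is gone.
unlinked = Loc(inode=42, path=None)
```

Under this model, `unlinked` is rejected by the observed rule and accepted by the proposed one, while a loc with neither field filled is rejected by both.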

Apologies if the component/priority/etc. aren't set correctly.  I'm not quite up on the local definitions or policies for such things.

Comment 1 Csaba Henk 2009-12-16 07:50:16 UTC
Hi Jeff,

Instead of fiddling with fuse_loc_fill

Comment 2 Csaba Henk 2009-12-16 07:55:12 UTC
Created attachment 120

[Sorry, I accidentally pressed Enter prematurely...]

Hi Jeff,

Instead of fiddling with fuse_loc_fill(), we propose to simply not call it when not needed.

Please check if the attached patch solves the problem.

Regards
Csaba
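
The idea of working from the fd rather than resolving a path has a direct analogue at the POSIX level, which can be checked on any local filesystem. The sketch below is only that analogy, not the attached patch; the function and file names are hypothetical.

```python
import os
import tempfile


def truncate_after_unlink(directory, length=4096):
    """Compare path-based and fd-based truncation of an unlinked file.

    Returns (outcome of the path-based call, size after the fd-based call).
    """
    path = os.path.join(directory, "unlinked.tmp")  # hypothetical name
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.unlink(path)
        # Path-based: the name no longer exists, so this cannot succeed.
        try:
            os.truncate(path, length)
            by_path = "ok"
        except FileNotFoundError:
            by_path = "ENOENT"
        # fd-based: needs no name, so the unlink does not matter.
        os.ftruncate(fd, length)
        return by_path, os.fstat(fd).st_size
    finally:
        os.close(fd)
```

On a local filesystem the path-based call fails with ENOENT while the fd-based call extends the file, which is the behaviour an application such as OpenOffice relies on.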

Comment 3 Jeff Darcy 2009-12-16 11:48:40 UTC
Your patch seems to work for me in the case where I had seen it before.  I am concerned, though, that the same basic issue affects more calls than just setattr - i.e. anything that could be called on an open but unlinked file.  Is the idea to apply this same logic to other functions?

Comment 4 Csaba Henk 2009-12-17 06:37:56 UTC
(In reply to comment #3)
> Your patch seems to work for me in the case where I had seen it before.  I am
> concerned, though, that the same basic issue affects more calls than just
> setattr - i.e. anything that could be called on an open but unlinked file.  Is
> the idea to apply this same logic to other functions?

Regarding the long-term perspective, I think the approach will be to weed loc out of those code paths that can make do with just an fd. According to the internal API, xlators that want to make use of a loc can count on the path being present there (the ones involved in your tests didn't actually do that, but that's another matter).

That said, we'd be glad if you could find some other scenario where an operation on an unlinked file b0rks. I have now checked read/write, and they seem to be fine.

Csaba
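
A quick way to look for other failing operations on an open-but-unlinked file is to probe a handful of fd-based calls and record which ones return an error. This is a hedged sketch: the function name, the file name, and the particular set of operations are assumptions, and it says nothing about which of these calls are routed through a loc inside GlusterFS.

```python
import errno
import os
import tempfile


def probe_unlinked_fd_ops(directory):
    """Try several fd-based operations on an open-but-unlinked file.

    Returns {operation name: None on success, or the errno name on failure}.
    """
    path = os.path.join(directory, "probe.tmp")  # hypothetical name
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    os.unlink(path)  # from here on, only the descriptor refers to the file

    ops = {
        "fstat": lambda: os.fstat(fd),
        "fchmod": lambda: os.fchmod(fd, 0o644),
        "fchown": lambda: os.fchown(fd, -1, -1),  # -1 leaves ids unchanged
        "ftruncate": lambda: os.ftruncate(fd, 4096),
        "write": lambda: os.pwrite(fd, b"data", 0),
        "read": lambda: os.pread(fd, 4, 0),
        "fsync": lambda: os.fsync(fd),
    }
    if os.utime in os.supports_fd:
        ops["futimens"] = lambda: os.utime(fd)  # set times to "now"

    results = {}
    try:
        for name, op in ops.items():
            try:
                op()
                results[name] = None
            except OSError as exc:
                results[name] = errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
    return results
```

Running it against a directory on the mount under test and printing the entries whose value is not None gives a short list of candidate operations to investigate; on a local filesystem every entry is expected to be None.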

Comment 5 Vijay Bellur 2009-12-18 13:41:40 UTC
PATCH: http://patches.gluster.com/patch/2622 in master (fuse-bridge: Don't try to fill a loc in setattr when we can proceed on with an fd.)