Bug 1294996

Summary: LibreOffice Cannot Open Files Over an NFS4 Mount, fcntl(F_SETLK F_WRLCK) hangs
Product: [Fedora] Fedora
Reporter: David Ashley <w.david.ashley>
Component: kernel
Assignee: nfs-maint
Status: CLOSED WORKSFORME
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: unspecified
Version: 23
CC: caolanm, dtardon, erack, gansalmon, itamar, jonathan, kernel-maint, ltinkl, madhu.chinakonda, mchehab, mstahl, sbergman, w.david.ashley
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2016-01-07 14:38:04 UTC
Type: Bug
Attachments:
  strace of soffice hang when opening nfs4 mounted file (flags: none)

Description David Ashley 2015-12-31 16:04:43 UTC
Description of problem:

When attempting to open a file (any oo document) that resides on an NFS4 mount, the application hangs at the oosplash process and never completes. Files that reside locally on the machine can be opened successfully.

Version-Release number of selected component (if applicable):

The server is an F23 x86_64 machine that holds most of our user files. None of my F23 x86_64 workstations can open LibreOffice files that reside on the server. My F22 workstations can successfully open the same files.

How reproducible:

Very reproducible.

Steps to Reproduce:
1. Open LibreOffice Calc.
2. Use File->Open to browse to a file on the NFS4 mount.
3. Attempt to open it. The application hangs.

Actual results:

The application hangs forever.

Expected results:

The file should be successfully opened.

Additional info:

All other applications I have tested can successfully open remote files so I assume that NFS4 or the network is not a part of the problem.

Comment 1 David Ashley 2016-01-02 19:54:43 UTC
When the application hangs, the System Monitor shows that LibreOffice is waiting on a pipe.

Comment 2 Stephan Bergmann 2016-01-04 12:27:24 UTC
Please start LibreOffice from a terminal shell as

  strace -f -t /usr/lib64/libreoffice/program/soffice.bin >log.txt 2>&1

(you can terminate it once it hangs with Ctrl-C in the terminal), and attach the resulting log.txt file.

Comment 3 David Ashley 2016-01-04 14:20:28 UTC
I have run the strace and found something strange. Running soffice.bin from the command line with or without strace actually works. What does not work is clicking on a document in the GUI or starting LibreOffice from the GUI and then attempting to open a new/existing document.

All the workstations have only the Cinnamon desktop installed. They do not have Gnome or KDE installed. It must be some interaction between Cinnamon and LibreOffice that is the problem, which makes sense because when it hangs after double-clicking on a document, the System Monitor reports it is waiting on a pipe.

Let me know if you need other information.

Comment 4 Stephan Bergmann 2016-01-04 15:15:37 UTC
(In reply to David Ashley from comment #3)
> I have run the strace and found something strange. Running soffice.bin from
> the command line with or without strace actually works. What does not work
> is clicking on a document in the GUI or starting LibreOffice from the GUI
> and then attempting to open a new/existing document.

Ah, right, running soffice.bin directly bypasses the soffice wrapper script, which is what sets the environment variable SAL_ENABLE_FILE_LOCKING=1.  Please try

  strace -f -t /usr/lib64/libreoffice/program/soffice >log.txt 2>&1

instead (i.e., without the ".bin"), that should probably expose the problem.
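
One way to check whether a running LibreOffice process actually received the variable is to read its environment from /proc. The following is a minimal Python sketch, assuming a Linux /proc filesystem; the helper names parse_environ and locking_variable_for are hypothetical and not part of LibreOffice.

```python
import os

VAR = "SAL_ENABLE_FILE_LOCKING"

def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE block found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            # Split on the first "=" only; values may contain "=" themselves.
            key, _, value = entry.partition(b"=")
            env[os.fsdecode(key)] = os.fsdecode(value)
    return env

def locking_variable_for(pid: int):
    """Return the value of SAL_ENABLE_FILE_LOCKING the process started with, or None."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read()).get(VAR)
```

Calling locking_variable_for with the PID of a soffice.bin started through the wrapper, and again with one started directly, should show the difference described above. Note that /proc/<pid>/environ reflects the environment at process start.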

Comment 5 David Ashley 2016-01-04 22:58:34 UTC
This is definitely a Cinnamon desktop problem. Everything works from the command line. But if I start LibreOffice from an icon and do a File->Open on a remote file, or double-click a remote file, then the application hangs.

If I start the application from a command line and then do a File->Open on a remote file, then it works. It also works if I start the application from a command line and pass a remote file as a command line parameter.

Very strange.

Comment 6 David Ashley 2016-01-06 14:08:56 UTC
Created attachment 1112186 [details]
strace of soffice hang when opening nfs4 mounted file

I was finally able to get an strace of the hang and I have attached it. If you need anything else please let me know.

Comment 7 Stephan Bergmann 2016-01-06 14:29:31 UTC
The relevant part being

> [pid 26709] 08:03:21 access("/imports/marvin/dashley/coverletter.odt", F_OK) = 0
> [pid 26709] 08:03:21 lstat("/imports/marvin/dashley/coverletter.odt", {st_mode=S_IFREG|0664, st_size=22189, ...}) = 0
> [pid 26709] 08:03:21 open("/imports/marvin/dashley/coverletter.odt", O_RDWR|O_EXCL) = 33
> [pid 26709] 08:03:21 fstat(33, {st_mode=S_IFREG|0664, st_size=22189, ...}) = 0
> [pid 26709] 08:03:21 fcntl(33, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0} <unfinished ...>

and then never coming back.
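
The same locking step can be approximated outside LibreOffice. The Python sketch below is not LibreOffice's own code: it opens the file read-write (plain O_RDWR, without the O_EXCL flag seen in the trace) and requests a non-blocking exclusive lock on the whole file via fcntl.lockf, which Python implements with the fcntl() record-locking call (F_SETLK with F_WRLCK for LOCK_EX | LOCK_NB). The function name try_write_lock is hypothetical.

```python
import errno
import fcntl
import os

def try_write_lock(path: str) -> bool:
    """Open path read-write and try a non-blocking whole-file write lock.

    Returns True if the lock was granted, False if another process holds a
    conflicting lock. On an affected NFS4 mount the lockf call itself may
    never return, which would match the hang seen in the strace.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # len=0, start=0, whence=SEEK_SET covers the whole file, as in the trace.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0, os.SEEK_SET)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False
            raise
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

Running it once against a local file and once against a file under the NFS4 mount would show whether the hang is specific to LibreOffice or reproducible with any fcntl lock request.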

Comment 8 Stephan Bergmann 2016-01-06 14:33:16 UTC
David, what exactly did you do to obtain that strace now, given your comment 5?

Comment 9 David Ashley 2016-01-06 14:46:45 UTC
First, I am still getting hangs somewhat randomly when opening remote files no matter how I start LibreOffice. I was too optimistic in my previous assessment.

For this trace, I ran your strace command. That brings up the main soffice window. From there I did a File->Open on a remote file, and that caused the hang. This scenario sometimes works and sometimes does not; mostly it fails. However, it always works on local files. For that matter, I can state flatly that it never hangs when opening a local file. Why it fails on NFS4 remote files is a mystery to me.

Let me know if you need anything else.
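
To compare local and NFS4 paths without freezing the session, the lock attempt can be run in a child process with a timeout. A hedged Python sketch follows; probe_lock, the exit-code convention, and the 5-second default are hypothetical choices. A child stuck in an uninterruptible kernel wait may not die when killed, so the parent deliberately does not wait for it afterwards.

```python
import subprocess
import sys

# Child program: open the file read-write and request a non-blocking
# exclusive fcntl lock on the whole file. Exit 0 if granted, 2 if denied.
CHILD = """
import fcntl, os, sys
fd = os.open(sys.argv[1], os.O_RDWR)
try:
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
except OSError:
    sys.exit(2)
sys.exit(0)
"""

def probe_lock(path: str, timeout: float = 5.0) -> str:
    """Return 'locked', 'denied', 'hung', or 'error' for a lock attempt on path."""
    child = subprocess.Popen([sys.executable, "-c", CHILD, path],
                             stderr=subprocess.DEVNULL)
    try:
        code = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()  # best effort; do not wait, the child may be unkillable
        return "hung"
    # Any other exit code (e.g. the file could not be opened) is an error.
    return {0: "locked", 2: "denied"}.get(code, "error")
```

Repeating probe_lock a number of times on a remote file would also give a rough failure rate for the intermittent behaviour described in this comment.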

Comment 10 Stephan Bergmann 2016-01-07 08:58:58 UTC
* "When the application hangs the system monitor shows that LibreOffice is waiting on a pipe."  You mean some /tmp/OSL_PIPE_*_SingleOfficeIPC_*?  That's unrelated and expected:  The first LO instance starts listening on a well-known socket, so that if the user should start further LO instances, they can notify the original instance via that socket and terminate again, instead of running in parallel.


* One temporary workaround you can try is disabling LO's file locking by commenting out the line

  export SAL_ENABLE_FILE_LOCKING

by temporarily changing it to

  # export SAL_ENABLE_FILE_LOCKING

in /usr/lib64/libreoffice/program/soffice (you need to do that with root privileges, of course).  As explained at <http://opengrok.libreoffice.org/xref/core/readlicense_oo/docs/readme.xrm#152>:  "File locking is enabled by default in [LO].  On a network that uses the Network File System protocol (NFS), the locking daemon for NFS clients must be active.  To disable file locking, edit the soffice script and change the line 'export SAL_ENABLE_FILE_LOCKING' to '# export SAL_ENABLE_FILE_LOCKING'.  If you disable file locking, the write access of a document is not restricted to the user who first opens the document."
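
Whether the workaround is currently in effect can be checked by looking for the export line in the wrapper script. This is a small Python sketch, under the assumption that the line appears as quoted above, optionally prefixed by "#"; locking_export_state is a hypothetical helper, not something shipped with LO.

```python
import re

# Matches "export SAL_ENABLE_FILE_LOCKING", optionally commented out with "#".
EXPORT_RE = re.compile(r"^\s*(#\s*)?export\s+SAL_ENABLE_FILE_LOCKING\b")

def locking_export_state(script_text: str) -> str:
    """Return 'enabled', 'disabled', or 'absent' for the export line."""
    state = "absent"
    for line in script_text.splitlines():
        match = EXPORT_RE.match(line)
        if match is None:
            continue
        if match.group(1) is None:
            return "enabled"   # an active export line wins
        state = "disabled"     # only a commented-out export seen so far
    return state
```

For example, passing the contents of /usr/lib64/libreoffice/program/soffice to locking_export_state should report "disabled" once the line has been commented out.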

Comment 11 Stephan Bergmann 2016-01-07 09:04:32 UTC
As discussed in comment 7, a hanging fcntl(F_SETLK F_WRLCK) on an NFS4-mounted file.  Tentatively moving this to kernel, as there appears to be no more specific component that matches.  If necessary, please pass on as appropriate.

Comment 12 David Ashley 2016-01-07 14:13:33 UTC
This fix works for me. File locking is almost a non issue at my site.

Thanks for all your help. Feel free to close this bug.