Bug 828521

Summary: Firefox profile in NFSv4 directory
Product: Red Hat Enterprise Linux 6 Reporter: Devin Bougie <devin.bougie>
Component: kernel Assignee: nfs-maint
Status: CLOSED WONTFIX QA Contact: Filesystem QE <fs-qe>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.2 CC: bfields, fs-qe, kzhang, nfs-maint, rwheeler
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-11-12 21:29:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Devin Bougie 2012-06-04 20:28:18 UTC
Description of problem:
We see periodic file system hangs when a Firefox profile is stored in an NFSv4 directory.


Version-Release number of selected component (if applicable):
Both client and server are fully updated RHEL6.2.


How reproducible:
Always


Steps to Reproduce:
1. Run Firefox with the profile stored in an NFSv4 file system. For example, mount a home directory using NFSv4.

  
Actual results:
Eventually the NFSv4 file system will wedge and all access to that FS from that client will block.  When this happens, "umount -f /file/system" will un-wedge the file system and everything continues where it left off.

[root@cesr3601 5475]# umount -f /home/rf_ctl 
umount2: Device or resource busy
umount: /home/rf_ctl: device is busy.
       (In some cases useful info about processes that use
        the device is found by lsof(8) or fuser(1))
umount2: Device or resource busy


Expected results:
Firefox should run smoothly using NFSv4.


Additional info:
We find lots of reports of problems with NFSv4 home directories and firefox with
FC16 and Ubuntu, but none yet for RHEL6:

https://bugzilla.redhat.com/show_bug.cgi?id=732748
https://bugzilla.redhat.com/show_bug.cgi?id=811138
http://thread.gmane.org/gmane.linux.nfs/48690/focus=48705
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/974664

Firefox opens a number of SQLite files, and SQLite uses flock to mediate
access, so this looks consistent with a problem with flock on NFSv4.
Running strace on a hung Firefox and then running 'umount -f' to unhang
it shows the process sitting in a futex wait, which (presumably) gets
woken by the umount attempt:

futex(0x2b5710c9eab0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0x2b5710c9eab0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0x2b5710c9eab0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x2b5710d8ca4c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x2b570f606238, 89100) = 1
futex(0x2b5710c9eab0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x2b56fdd0c040, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x2b56fdd0c040, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x2b57083f630c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2b57083f6308, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
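
To test the locking hypothesis without Firefox in the loop, a small probe can take and release an exclusive advisory lock on a file inside the NFS mount and report how long the round trip took. The following is only a sketch under stated assumptions: the function name probe_lock is hypothetical, and it uses fcntl.lockf (POSIX byte-range locks), which may or may not be the exact call path Firefox and SQLite exercise on this system. On a wedged mount the lock call itself may block, even though it is requested as non-blocking, because the underlying RPC never completes.

```python
import errno
import fcntl
import os
import time


def probe_lock(path):
    """Take and release a non-blocking exclusive POSIX lock on `path`.

    Returns (acquired, seconds). `acquired` is False when another
    process already holds a conflicting lock. `probe_lock` is a
    hypothetical helper name, not part of any existing tool.
    """
    start = time.monotonic()
    # Create the probe file if it does not exist yet.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # EACCES/EAGAIN mean the lock is held elsewhere.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False, time.monotonic() - start
            raise
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, time.monotonic() - start
    finally:
        os.close(fd)
```

Running the probe periodically against a scratch file in the affected mount (for example a file under /home/rf_ctl) and logging the elapsed time would show whether lock requests slow down or stop returning before the hang becomes visible in Firefox.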

Comment 1 Ric Wheeler 2012-06-04 20:44:04 UTC
Hi Devin,

This would be a great issue to work with Red Hat support to help you gather information/debug. Can you please open an official support ticket if you have formal support?

Best regards,

Ric

Comment 3 Devin Bougie 2012-11-12 15:15:37 UTC
Hello,

It looks like we resolved this by switching from UDP to TCP.  The historical reason for forcing the Linux systems to UDP--incompatibilities with Tru64 file servers--isn't so much of an issue anymore.

Thanks,
Devin

Comment 4 J. Bruce Fields 2012-11-12 21:29:56 UTC
Thanks for the report.

We don't support NFSv4 over UDP, so I'm going to mark this not a bug.

(Though we really should just turn it off completely--see bugs 606260 and 606263).

(It's also conceivable this would be reproducible with v2 or v3 and UDP, in which case we might want to revisit this bug.  We discourage UDP even over v2 and v3, so that doesn't seem like a high priority for now.)
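
Since the resolution hinged on the transport, it can help to check which NFS mounts on a client are still using UDP. The sketch below scans text in the /proc/mounts format for nfs and nfs4 entries whose option list contains proto=udp (or proto=udp6). The function name nfs_udp_mounts is hypothetical, and it assumes the kernel reports the transport as a proto= mount option, as Linux NFS clients commonly do.

```python
def nfs_udp_mounts(mounts_text):
    """Return (mount_point, fstype) for NFS mounts using UDP.

    `mounts_text` is expected to be in /proc/mounts format:
    device, mount point, fs type, comma-separated options, ...
    `nfs_udp_mounts` is a hypothetical helper name.
    """
    udp_options = {"proto=udp", "proto=udp6"}
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype not in ("nfs", "nfs4"):
            continue
        if udp_options & set(options.split(",")):
            hits.append((mount_point, fstype))
    return hits
```

On a client, passing in the contents of /proc/mounts (read with open("/proc/mounts").read()) would list any mounts that should be moved to proto=tcp.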