Bug 592513

Summary: generic fs tests leave secure NFSv4 mount point hang
Product: Red Hat Enterprise Linux 6 Reporter: Qian Cai <caiqian>
Component: kernel Assignee: Jeff Layton <jlayton>
Status: CLOSED CURRENTRELEASE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.0 CC: anton, bfields, dougsland, gansalmon, itamar, jlayton, jonathan, kernel-maint, mvadkert, rwheeler, steved
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 562055 Environment:
Last Closed: 2010-11-23 20:56:44 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On: 562055, 698855    
Bug Blocks:    

Description Qian Cai 2010-05-15 03:29:47 UTC
+++ This bug was initially created as a clone of Bug #562055 +++

Description of problem:
When running the test case here,
https://fedoraproject.org/wiki/QA:Testcase_nfs_generic_secure

After running the generic fs tests on the mount point for a while, the client behaved strangely, and commands such as df and ps aux would hang:


# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_intels3e314401-lv_root
                      60084332   3182180  53850020   6% /
tmpfs                  4084324         0   4084324   0% /dev/shm
/dev/sda1               198337     58851    129246  32% /boot
<hung...>

The following was seen in the server's /var/log/messages:
Feb  5 00:00:52 amd-toonie2-01 kernel: RPC: AUTH_GSS upcall timed out.
Feb  5 00:00:52 amd-toonie2-01 kernel: Please check user daemon is running.
Feb  5 00:00:55 amd-toonie2-01 rpc.gssd[1342]: WARNING: can't create tcp rpc_clnt to server intel-s3e3144-01.rhts.eng.nay.redhat.com for user with uid 0: RPC: Remote system error - No route to host
Feb  5 00:00:56 amd-toonie2-01 rpc.gssd[1342]: WARNING: can't create tcp rpc_clnt to server intel-s3e3144-01.rhts.eng.nay.redhat.com for user with uid 0: RPC: Remote system error - No route to host
Feb  5 00:23:00 amd-toonie2-01 kernel: iint_free: readcount: 1
Feb  5 00:23:00 amd-toonie2-01 kernel: iint_free: writecount: -1
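When triaging a log like the one above, it can help to separate the kernel's AUTH_GSS upcall timeouts from the rpc.gssd connection failures. The following is a minimal sketch in Python, assuming the message formats match the lines quoted in this report exactly; the function name summarize_gss_failures and the regular expression are illustrative and not part of nfs-utils.

```python
import re

# Matches the rpc.gssd warning format quoted in this report (an assumption:
# other nfs-utils versions may word the message differently).
GSSD_WARNING = re.compile(
    r"rpc\.gssd\[\d+\]: WARNING: can't create tcp rpc_clnt to server "
    r"(?P<server>\S+) for user with uid (?P<uid>\d+): (?P<reason>.*)$"
)


def summarize_gss_failures(lines):
    """Count AUTH_GSS upcall timeouts and collect rpc.gssd connection failures.

    Returns (timeouts, failures), where failures is a list of
    (server, uid, reason) tuples in the order they appear.
    """
    timeouts = 0
    failures = []
    for line in lines:
        # Kernel side: the upcall to the user-space daemon timed out.
        if "RPC: AUTH_GSS upcall timed out" in line:
            timeouts += 1
        # Daemon side: rpc.gssd could not open a TCP connection.
        match = GSSD_WARNING.search(line)
        if match:
            failures.append(
                (match["server"], int(match["uid"]), match["reason"].strip())
            )
    return timeouts, failures
```

Feeding it the lines of /var/log/messages (for example via open("/var/log/messages") on a host where that file is readable) gives a quick view of whether the timeouts coincide with connection failures to one particular server.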

Version-Release number of selected component (if applicable):
nfs-utils-1.2.1-15.fc13
kernel-2.6.33-0.16.rc4.git6.fc13

How reproducible:
always

Steps to Reproduce:
https://fedoraproject.org/wiki/QA:Testcase_nfs_generic_secure
  
Actual results:
Tests were unable to complete due to mount point hang.

Expected results:
Tests can complete.

--- Additional comment from caiqian@redhat.com on 2010-02-05 01:51:08 EST ---

Created an attachment (id=388995)
sysrq-t output from the client when the test hung

--- Additional comment from mvadkert@redhat.com on 2010-02-05 08:31:55 EST ---

The same behaviour in my tests

--- Additional comment from fedora-triage-list@redhat.com on 2010-03-15 10:24:00 EDT ---


This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 1 RHEL Product and Program Management 2010-05-15 03:45:18 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 2 Jeff Layton 2010-05-17 11:33:19 UTC
Hmmm...looks more like some sort of generic networking problem -- i.e. problem creating or connecting the socket to the server:

Feb  5 00:00:55 amd-toonie2-01 rpc.gssd[1342]: WARNING: can't create tcp
rpc_clnt to server intel-s3e3144-01.rhts.eng.nay.redhat.com for user with uid
0: RPC: Remote system error - No route to host

...the kernel looks like it's doing the right thing here (hanging until the upcall starts responding).
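One way to check the networking theory is to attempt a plain TCP connection to the host named in the rpc.gssd warning from the machine that logged it. The sketch below is only an illustration: the helper name can_reach_tcp is hypothetical, and port 2049 (the standard NFS port) is an assumption, since the log does not say which port rpc.gssd was trying to reach.

```python
import socket


def can_reach_tcp(host, port=2049, timeout=5.0):
    """Try to open a TCP connection to host:port.

    Returns (True, None) on success, or (False, error_text) if the
    connection fails (for example "No route to host" or a timeout).
    Port 2049 is an assumed default, not taken from the bug report.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Hostname taken from the rpc.gssd warning quoted above.
    ok, err = can_reach_tcp("intel-s3e3144-01.rhts.eng.nay.redhat.com")
    print("reachable" if ok else "unreachable: %s" % err)
```

If this fails with the same "No route to host" error, the problem is likely below the RPC layer (routing or firewalling) rather than in the GSS upcall path itself.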

Comment 4 Jeff Layton 2010-10-21 12:27:22 UTC
Is this still a problem in more recent RHEL6 builds?

Comment 5 Jeff Layton 2010-11-23 20:56:44 UTC
No response in quite some time and I've not heard of any problems along these lines since this bug was opened. I'm going to close with a resolution of CURRENTRELEASE under the assumption that this was fixed before release in some of the later kernels. Please reopen if that's not the case.