Bug 249738 - Autofs5 failed unmounting Solaris NFS server share
Summary: Autofs5 failed unmounting Solaris NFS server share
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: autofs
Version: 5.0
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Ian Kent
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 426495
 
Reported: 2007-07-26 18:49 UTC by Simon Gao
Modified: 2008-05-21 14:37 UTC (History)
1 user (show)

Fixed In Version: RHBA-2008-0354
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 14:37:29 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
debug log and config information (5.69 KB, text/plain)
2007-07-26 18:49 UTC, Simon Gao
no flags Details
Patch to have umount_multi report statfs fail errno (735 bytes, patch)
2007-07-27 03:04 UTC, Ian Kent
no flags Details | Diff
Detailed debug log with patched autofs5 (31.91 KB, application/octet-stream)
2007-07-27 05:16 UTC, Simon Gao
no flags Details
Fix large file compile time dumbness (309 bytes, patch)
2007-07-27 06:34 UTC, Ian Kent
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0354 0 normal SHIPPED_LIVE autofs bug fix and enhancement update 2008-05-20 12:52:25 UTC

Description Simon Gao 2007-07-26 18:49:19 UTC
Autofs 5 fails to automatically unmount NFS shares served by a Solaris 10
server, but works with other NFS servers such as NetApp and EMC.

Please see the attached file for details.

Comment 1 Simon Gao 2007-07-26 18:49:19 UTC
Created attachment 160059 [details]
debug log and config information

Comment 2 Ian Kent 2007-07-27 03:02:28 UTC
Thanks for the bug report.

It's strange that the stat of /nfs/share1 fails.
The first thing we need to do is find out why it fails.

Could you please apply the patch below so we can find out.

Ian


Comment 3 Ian Kent 2007-07-27 03:04:25 UTC
Created attachment 160090 [details]
Patch to have umount_multi report statfs fail errno

Comment 4 Simon Gao 2007-07-27 05:16:49 UTC
Created attachment 160095 [details]
Detailed debug log with patched autofs5

Detailed logging after installing and starting patched autofs5.

Comment 5 Simon Gao 2007-07-27 05:20:29 UTC
The most relevant part is:

=================================================================================
Jul 26 21:50:07 client1 automount[32214]: st_expire: state 1 path /nfs
Jul 26 21:50:07 client1 automount[32214]: expire_proc: exp_proc = 3077839760
path /nfs
Jul 26 21:50:07 client1 automount[32214]: expire_proc_indirect: expire /nfs/share1
Jul 26 21:50:07 client1 automount[32214]: handle_packet: type = 4
Jul 26 21:50:07 client1 automount[32214]: handle_packet_expire_indirect: token
2050, name share1
Jul 26 21:50:07 client1 automount[32214]: expiring path /nfs/share1
Jul 26 21:50:07 client1 automount[32214]: umount_multi: path /nfs/share1 incl 1
Jul 26 21:50:07 client1 automount[32214]: umount_multi: could not stat fs of
/nfs/share1: Value too large for defined data type
Jul 26 21:50:07 client1 automount[32214]: do_expire: couldn't complete expire of
/nfs/share1
Jul 26 21:50:07 client1 automount[32214]: send_fail: token = 2050
Jul 26 21:50:07 client1 automount[32214]: expire_proc_indirect: 1 remaining in /nfs
Jul 26 21:50:07 client1 automount[32214]: mount still busy /nfs
Jul 26 21:50:07 client1 automount[32214]: expire_cleanup: got thid 3077839760
path /nfs stat 2
Jul 26 21:50:07 client1 automount[32214]: expire_cleanup: sigchld: exp
3077839760 finished, switching from 2 to 1
Jul 26 21:50:07 client1 automount[32214]: st_ready: st_ready(): state = 2 path /nfs
===============================================================================

/nfs/share1 is a 3.6TB to 3.8TB filesystem, if the size of the filesystem matters.

Simon

Comment 6 Simon Gao 2007-07-27 05:23:13 UTC
Is there a filesystem size limit for autofs5? How big is it if so?

Simon

Comment 7 Ian Kent 2007-07-27 05:54:56 UTC
(In reply to comment #6)
> Is there a filesystem size limit for autofs5? How big is it if so?

Looks like this is not an autofs problem.

It looks like statfs, on a 32-bit system, checks whether
the total blocks, free blocks, available blocks, and the
total and free inode counts fit in the 32-bit fields the
kernel needs to return, and fails if any of these values
exceed 32 bits.

Are any of the values larger than 32 bits?

Ian


Comment 8 Ian Kent 2007-07-27 06:34:25 UTC
Created attachment 160099 [details]
Fix large file compile time dumbness

Could you please try out this patch?
It seems to be fine with the small filesystems
that I've tested against.

Ian

Comment 9 Ian Kent 2007-07-27 06:42:56 UTC
I think that the maximum filesystem size on Solaris 9
used to be 2TB, so the maximum block count would then be
at most 2^32, even for block sizes as small as
512 bytes. Correct me here if I'm wrong.

I guess that's changed with Solaris 10.

Ian


Comment 10 Simon Gao 2007-07-27 17:18:44 UTC
The new patch fixed the problem.
=================================================================================

Jul 27 09:57:24 client1 automount[8135]: st_expire: state 1 path /nfs
Jul 27 09:57:24 client1 automount[8135]: expire_proc: exp_proc = 3078425488 path
/nfs
Jul 27 09:57:24 client1 automount[8135]: expire_proc_indirect: expire /nfs/share1
Jul 27 09:57:24 client1 automount[8135]: handle_packet: type = 4
Jul 27 09:57:24 client1 automount[8135]: handle_packet_expire_indirect: token
2202, name share1
Jul 27 09:57:24 client1 automount[8135]: expiring path /nfs/share1
Jul 27 09:57:24 client1 automount[8135]: umount_multi: path /nfs/share1 incl 1
Jul 27 09:57:24 client1 automount[8135]: unmounting dir = /nfs/share1
Jul 27 09:57:24 client1 automount[8135]: rm_unwanted_fn: removing directory
/nfs/share1
Jul 27 09:57:24 client1 automount[8135]: expired /nfs/share1
Jul 27 09:57:24 client1 automount[8135]: send_ready: token = 2202
Jul 27 09:57:24 client1 automount[8135]: expire_cleanup: got thid 3078425488
path /nfs stat 0
Jul 27 09:57:24 client1 automount[8135]: expire_cleanup: sigchld: exp 3078425488
finished, switching from 2 to 1
Jul 27 09:57:24 client1 automount[8135]: st_ready: st_ready(): state = 2 path /nfs
===============================================================================

The problem only exists on 32-bit RHEL 5 systems; 64-bit RHEL 5 does not have
this problem.

Can I go ahead and apply the patch to my production machines now, or should I
wait for further QA testing by you guys?

Sorry, I failed to mention that the filesystem in question is ZFS, a 128-bit
filesystem from Sun. If you want to know more about it, here is a link:
http://www.sun.com/2004-0914/feature/

Thanks for fixing the problem so quickly.

Simon

Comment 11 Simon Gao 2007-08-27 21:03:05 UTC
Any updates on this bug? Is it fixed? Are there patches or updates going into
a future autofs update for RHEL 5?

Simon

Comment 12 Ian Kent 2007-08-28 03:06:17 UTC
(In reply to comment #11)
> Any updates on this bug? Is it fixed? Are there patches or updates going into
> a future autofs update for RHEL 5?

I added the patch upstream, to Rawhide and Fedora.
I've had one bug report saying that, with this update, a
reboot was needed. I really don't know if it's possible
to prevent that.

Ian

Comment 13 RHEL Program Management 2007-10-16 03:52:28 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 20 errata-xmlrpc 2008-05-21 14:37:29 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0354.html


