Bug 805660 - System freezes during large NFS transfers after kernel-2.6.18-308.1.1.el5
Summary: System freezes during large NFS transfers after kernel-2.6.18-308.1.1.el5
Keywords:
Status: CLOSED DUPLICATE of bug 799941
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: nfs-utils
Version: 5.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: nfs-maint
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-21 18:35 UTC by Eric Schewe
Modified: 2012-06-12 18:48 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-12 18:48:13 UTC
Target Upstream Version:
Embargoed:



Description Eric Schewe 2012-03-21 18:35:17 UTC
Description of problem:

We have one RHEL 5.8 server acting as an NFS server and three RHEL 5.8 servers acting as clients. When the NFS clients' nightly backup jobs run, they copy a large amount of data via NFS to the NFS server. All three NFS clients freeze and become completely unresponsive; we have to power cycle them. We've been doing these backups for years, and the problem only occurred after updating to kernel 2.6.18-308.1.1.el5. We reverted all three clients to kernel 2.6.18-274.18.1.el5 and the freezing stopped.

There are no problems with the NFS server which is still running kernel 2.6.18-308.1.1.el5.


Version-Release number of selected component (if applicable):
nfs-utils-1.0.9-60.el5
nfs-utils-lib-1.0.8-7.9.el5
kernel-2.6.18-308.1.1.el5

How reproducible:


Steps to Reproduce:
1. Boot system using kernel-2.6.18-308.1.1.el5
2. Execute backup script that copies the contents from the client NFS systems to the NFS server.
  
Actual results:
System freezes

Expected results:
System to remain functional and copy operation to complete

Additional info:
 NFS Clients /etc/fstab
  nfsserver.hostname:/      /mnt/nfsExport      nfs4    rw,_netdev,sync,hard,intr,rsize=8192,wsize=8192

 NFS Server /etc/fstab
  <local path>         <local mount>         none     bind    0 0

 NFS Server /etc/exports
  <local path>                   <remote host fqdn>(rw,sync,no_wdelay,all_squash,nohide,no_subtree_check,fsid=0)
  <local path>/backups           <remote host fqdn>(rw,sync,no_wdelay,no_root_squash,nohide,no_subtree_check)
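
For anyone comparing their own client configuration against the one above, here is a minimal sketch (not part of the original report) that splits an fstab entry into its fields so the NFS version and mount options are easy to inspect. The function name `parse_fstab_line` is hypothetical, and the sketch assumes whitespace-separated fields with comma-separated options, as in the client entry shown above.

```python
def parse_fstab_line(line):
    """Split one fstab entry into device, mount point, type and an options dict.

    Returns None for blank lines, comments, or entries with fewer than
    four fields. Hypothetical helper, for illustration only.
    """
    fields = line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return None
    device, mount_point, fs_type, raw_opts = fields[:4]
    options = {}
    for opt in raw_opts.split(","):
        # "rsize=8192" becomes {"rsize": "8192"}; a bare flag like "hard" maps to True
        key, _, value = opt.partition("=")
        options[key] = value or True
    return {
        "device": device,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": options,
    }


# The client entry from this report
entry = parse_fstab_line(
    "nfsserver.hostname:/      /mnt/nfsExport      nfs4    "
    "rw,_netdev,sync,hard,intr,rsize=8192,wsize=8192"
)
uses_nfs4 = entry is not None and entry["fs_type"] == "nfs4"
```

Here `uses_nfs4` is true for the reported client entry, which is the detail that matters when checking whether a mount matches the NFSv4 setup described in this report.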

Comment 1 Ric Wheeler 2012-04-17 20:04:35 UTC
Hi Eric,

If you have a Red Hat support contract, please work with our support team to help gather information, etc.

Thanks for the bug report!

Comment 2 Eric Schewe 2012-04-17 20:09:51 UTC
And if I do not?

Comment 3 jadavis6 2012-04-17 21:02:41 UTC
FWIW we (North Carolina Agricultural and Technical State) are having the exact same issue. We do have a support contract through the State of North Carolina and I'm working on finding a point of contact. I have network traces from the period of the hang that I can attach if you want, but it may be a while until UNC-GA gets back to me with how they want us to escalate this to you guys.

Would it be possible to leave this Bugzilla open or would I have to create a new one?

Comment 4 Steve Dickson 2012-04-18 12:53:59 UTC
(In reply to comment #0)
> 
> 
> Steps to Reproduce:
> 1. Boot system using kernel-2.6.18-308.1.1.el5
> 2. Execute backup script that copies the contents from the client NFS systems
> to the NFS server.
> 
> Actual results:
> System freezes
> 
> Expected results:
> System to remain functional and copy operation to complete
> 
> Additional info:
>  NFS Clients /etc/fstab
>   nfsserver.hostname:/      /mnt/nfsExport      nfs4   
> rw,_netdev,sync,hard,intr,rsize=8192,wsize=8192
I'm noticing you are using v4.... Just curious does the freeze happen with v3?

> 
>  NFS Server /etc/fstab
>   <local path>         <local mount>         none     bind    0 0
> 
>  NFS Server /etc/exports
>   <local path>                   <remote host
> fqdn>(rw,sync,no_wdelay,all_squash,nohide,no_subtree_check,fsid=0)
>   <local path>/backups           <remote host
> fqdn>(rw,sync,no_wdelay,no_root_squash,nohide,no_subtree_check)
Is the server also running a 308.1.1.el5 kernel?

Comment 5 Ric Wheeler 2012-04-18 13:21:36 UTC
(In reply to comment #2)
> And if I do not?

If you don't have a customer support arrangement, the right thing to do is to follow up with the NFS community on the upstream lists or your (other) vendor.

Comment 6 Eric Schewe 2012-04-18 16:36:01 UTC
(In reply to comment #4)
> (In reply to comment #0)
> > 
> > 
> > Steps to Reproduce:
> > 1. Boot system using kernel-2.6.18-308.1.1.el5
> > 2. Execute backup script that copies the contents from the client NFS systems
> > to the NFS server.
> > 
> > Actual results:
> > System freezes
> > 
> > Expected results:
> > System to remain functional and copy operation to complete
> > 
> > Additional info:
> >  NFS Clients /etc/fstab
> >   nfsserver.hostname:/      /mnt/nfsExport      nfs4   
> > rw,_netdev,sync,hard,intr,rsize=8192,wsize=8192
> I'm noticing you are using v4.... Just curious does the freeze happen with v3?
> 
> > 
> >  NFS Server /etc/fstab
> >   <local path>         <local mount>         none     bind    0 0
> > 
> >  NFS Server /etc/exports
> >   <local path>                   <remote host
> > fqdn>(rw,sync,no_wdelay,all_squash,nohide,no_subtree_check,fsid=0)
> >   <local path>/backups           <remote host
> > fqdn>(rw,sync,no_wdelay,no_root_squash,nohide,no_subtree_check)
> Is the server also running a 308.1.1.el5 kernel?

We have not tried switching from v4 to v3. Since these systems are in production and I don't have any spare identical hardware, I can't really experiment. I'll try using a VM to see if I can replicate the problem, and then switch to v3.

The NFS server is still running 2.6.18-308.1.1.el5. We only had to revert NFS clients.

Comment 7 Eric Schewe 2012-04-30 18:58:23 UTC
I've changed the assignment of this from the kernel team to the nfs-utils team. Is that what you meant by "..the right thing to do is to follow up with the NFS community on the upstream lists.."?

Comment 8 Eric Schewe 2012-06-12 18:28:56 UTC
This issue appears to have been resolved as of kernel 2.6.18-308.4.1.el5 and 2.6.18-308.8.2.el5.
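
As a rough aid for checking other clients, here is a minimal sketch that classifies a kernel release string using only the versions mentioned in this bug. The names `REPORTED_AFFECTED`, `REPORTED_OK` and `classify_kernel` are hypothetical. The sketch assumes the release string matches the form shown in this report exactly, and it does not try to guess the status of any kernel not named here. Note that the freeze was only reported on NFS clients; the server ran 2.6.18-308.1.1.el5 without problems.

```python
import platform

# Kernel releases named in this bug report (client-side observations only)
REPORTED_AFFECTED = {"2.6.18-308.1.1.el5"}
REPORTED_OK = {
    "2.6.18-274.18.1.el5",  # clients were reverted to this and stopped freezing
    "2.6.18-308.4.1.el5",   # reported as resolved
    "2.6.18-308.8.2.el5",   # reported as resolved
}


def classify_kernel(release):
    """Return how this bug report describes the given kernel release."""
    if release in REPORTED_AFFECTED:
        return "reported-affected"
    if release in REPORTED_OK:
        return "reported-ok"
    # Anything else is not covered by this report
    return "unknown"


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`
    print(classify_kernel(platform.release()))
```

A result of "unknown" only means the release is not mentioned in this report, not that it is safe.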

Comment 9 Jeff Layton 2012-06-12 18:48:13 UTC
Sounds likely that this is a duplicate of bug 799941.

*** This bug has been marked as a duplicate of bug 799941 ***

