Bug 848706

Summary: nfs commit feature has a deadlock bug in kernel (I have found the root cause, please see attachment for details).
Product: Red Hat Enterprise Linux 5
Reporter: Mitz Amano <mitz.amano>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED NOTABUG
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 5.8
CC: jlayton, nfs-maint, rpacheco, rwheeler, steved
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-08-17 11:21:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: I have found the root cause of it, please see the details.
Flags: none

Description Mitz Amano 2012-08-16 09:06:14 UTC
Created attachment 604834 [details]
I have found the root cause of it, please see the details.

Global Description
Bug Summary
1) Kernel Version:
a. Red Hat version: kernel-2.6.18-308.8.2.el5 (downloaded from the website)
b. Asianux version: kernel-2.6.18-308.3.AXS3 (our own patches merged into kernel-2.6.18-308.8.2.el5.src.rpm and rebuilt)

2) Running Environment:
a. LTP (ltp-full-20100331.gz) stress test run on the local machine for 2-3 days, with an NFS file system mounted on the local machine.
b. The test script is /opt/ltp/ltpstress.sh
c. 4-8 CPU cores, 2.0-2.5GHz, 3.5-4GB RAM, 4-8GB swap space, 4-6GB disk free space.

3) Issue Appearance:
a. Processes that access files on the NFS mount deadlock.
b. They cannot be killed with the “kill -9” command.
c. The machine cannot be shut down in the normal way.

Root Cause

1) It is a bug in the NFS commit feature, which was first added in kernel-2.6.18-308.4.1.el5.
a. The NFS commit operation needs a lock for synchronization.
b. It does not consider that the kernel memory manager may call the NFS commit operation indirectly through the alloc_page function, which is commonly used by many subsystems (including the nfs and nfsd subsystems).
c. Under some conditions, this causes a deadlock.

2) So far we have caught 3 conditions that cause the deadlock: IssueX1, IssueX2, and IssueX3.


3) This document introduces IssueX1, IssueX2, and IssueX3 (especially IssueX2).
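
As an illustration of the re-entry described in 1) above, here is a minimal Python sketch. It is a toy model, not kernel code: the class and method names (ToyNfsInode, alloc_page, commit) are hypothetical, and a lock timeout is used only so that the sketch reports the deadlock instead of hanging.

```python
import threading


class CommitDeadlock(Exception):
    """Raised instead of hanging when the commit lock is re-entered."""


class ToyNfsInode:
    """Toy model; names are hypothetical and do not mirror kernel symbols."""

    def __init__(self, free_pages):
        self.free_pages = free_pages
        # Non-reentrant lock: a second acquire on the same call path blocks.
        self.commit_lock = threading.Lock()

    def alloc_page(self):
        # When no pages are free, the allocator tries to reclaim memory by
        # committing dirty NFS pages, which re-enters commit().
        if self.free_pages == 0:
            self.commit()
            self.free_pages += 1  # one page reclaimed by the commit
        self.free_pages -= 1

    def commit(self):
        # A real kernel path would block here forever; the timeout turns the
        # hang into an exception so the sketch terminates.
        if not self.commit_lock.acquire(timeout=0.1):
            raise CommitDeadlock("commit lock is already held on this call path")
        try:
            self.alloc_page()  # the commit itself needs memory
        finally:
            self.commit_lock.release()
```

With free pages available, commit() completes normally. With free_pages=0, the nested commit() cannot take the lock held by the outer one, which is the shape of the deadlock described above.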


Please see the attachment for details.

Comment 1 Mitz Amano 2012-08-16 09:15:04 UTC
Also see Bug 836095, which I originally filed one month ago. It is the same issue.

Comment 2 Ric Wheeler 2012-08-16 20:26:26 UTC
Hi Chen,

Have you described these issues upstream?  That is the best place to work to resolve issues unless you have a support contract with Red Hat or formal partner relationship.

Best regards,

Ric

Comment 3 Jeff Layton 2012-08-16 20:52:48 UTC
IIUC, the document spells out a deadlock that is possible if you mount via nfs a filesystem being exported from knfsd on the same host. That's an easily deadlockable situation for many reasons and is quite explicitly unsupported for that reason.

It's quite easy for the server to get into a situation where it needs memory in order to process a request from the client, so it tries to flush data to the server which can't do anything until that flush completes.

This is not just a deadlockable configuration for RHEL5, but for *any* version of the kernel.

I'm going to go ahead and close this WONTFIX. Please reopen if you wish to discuss it further.

Comment 4 Mitz Amano 2012-08-17 01:40:17 UTC
(In reply to comment #2)
> Hi Chen,
> 
> Have you described these issues upstream?  That is the best place to work to
> resolve issues unless you have a support contract with Red Hat or formal
> partner relationship.
> 
> Best regards,
> 
> Ric

It is running in the Asianux test environment.

The Asianux kernel is based on the Red Hat kernel.

For RHEL5 (Asianux Server 3), we need to merge our patches into the kernel.

The test environment is described in the report document (please refer to it).

(For stress testing, we use ltpstress.sh, which runs on the local machine.)


I hope this information is helpful to you.

Comment 5 Mitz Amano 2012-08-17 01:58:31 UTC
(In reply to comment #3)
> IIUC, the document spells out a deadlock that is possible if you mount via
> nfs a filesystem being exported from knfsd on the same host. That's an
> easily deadlockable situation for many reasons and is quite explicitly
> unsupported for that reason.
> 
> It's quite easy for the server to get into a situation where it needs memory
> in order to process a request from the client, so it tries to flush data to
> the server which can't do anything until that flush completes.
> 
> This is not just a deadlockable configuration for RHEL5, but for *any*
> version of the kernel.
> 
> I'm going to go ahead and close this WONTFIX. Please reopen if you wish to
> discuss it further.



Firstly,

sorry that my English is not very good (I do not know the meaning of "IIUC").



Secondly, 

1) I do not think that it can happen for *any* version of the kernel.

2) I have tested some versions of the RHEL6 kernel (at least 5 versions) using almost the same test environment, and have not found this issue ("maybe" longer stress testing is needed?).

3) As far as I know (from reading the source code of nfs version 3 and nfsd version 3), the original nfs module "seems" (only seems) to have considered the situation where a knfsd export is mounted on the local machine.

4) For RHEL5, all of the issues (although they are different) occurred around the commit feature of the nfs subsystem, not around any other feature of the nfs subsystem.


At last:

Sorry, but I do not think it is suitable to mark this bug as CLOSED.

Comment 6 Jeff Layton 2012-08-17 11:21:49 UTC
(In reply to comment #5)
> 
> Firstly,
> 
> sorry that my English is not very good (I do not know the meaning of "IIUC").
> 

IIUC means "If I understand correctly".

> 
> 
> Secondly, 
> 
> 1) I do not think that it can happen for *any* version of the kernel.
> 

I've looked over all of the stack traces in your MS-Word document, and they all point to the same root cause. You have a configuration where the same host is acting as both NFS client and server. That's a configuration known to cause deadlocks.


> 2) I have tested some versions of the RHEL6 kernel (at least 5 versions) using
> almost the same test environment, and have not found this issue ("maybe"
> longer stress testing is needed?).
> 

Then you've been lucky (or unlucky). These sorts of deadlocks are just as possible there. You can minimize the chances of them by tuning the virtual memory settings in such a way as to keep knfsd out of direct reclaim, but it's still possible to deadlock.
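
The comment above does not say which virtual memory settings are meant. As a starting point for inspecting them, here is a minimal Python sketch that reads a few standard Linux VM sysctls from /proc/sys/vm. The choice of min_free_kbytes, dirty_ratio, and dirty_background_ratio is an assumption made for illustration, and the helper name read_vm_tunables is hypothetical.

```python
from pathlib import Path

# Tunables commonly discussed for reclaim and dirty-page behaviour. Which of
# them matter for a given workload is an assumption, not something the thread
# states.
VM_TUNABLES = ("min_free_kbytes", "dirty_ratio", "dirty_background_ratio")


def read_vm_tunables(names=VM_TUNABLES, root="/proc/sys/vm"):
    """Return {name: int or None} for each tunable found under root."""
    values = {}
    for name in names:
        try:
            values[name] = int((Path(root) / name).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            # Missing, unreadable, or not a single integer value.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        print(f"vm.{name} = {value}")
```

The sketch only reads the values. As the comment above notes, tuning can reduce the chance of a loopback deadlock but cannot rule it out.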

> 3) As far as I know (from reading the source code of nfs version 3 and nfsd
> version 3), the original nfs module "seems" (only seems) to have considered
> the situation where a knfsd export is mounted on the local machine.
> 

It may have considered it, but it certainly isn't engineered in such a way as to do so under heavy stress. We occasionally use mounts over loopback for certain sorts of testing, but anything that stresses memory utilization is generally problematic for this reason.

> 4) For RHEL5, all of the issues (although they are different) occurred around
> the commit feature of the nfs subsystem, not around any other feature of the
> nfs subsystem.
> 

Actually all 3 issues look pretty similar to me. The root cause is that you have knfsd in a situation where it needs to reclaim memory in order to proceed. It then tries to flush dirty NFS pages to itself, which can't work until it can allocate memory.
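
The circular wait described above can be sketched as a wait-for graph. This is a simplified model under stated assumptions: the node labels are descriptive strings rather than kernel symbols, each party waits for at most one other, and find_cycle is a hypothetical helper.

```python
def find_cycle(waits_for):
    """Return the nodes of a cycle in a wait-for graph, or None.

    waits_for maps each waiter to the single thing it is waiting for.
    """
    for start in waits_for:
        path, seen = [start], {start}
        node = waits_for.get(start)
        while node is not None:
            if node in seen:
                # Close the loop by repeating the node that was revisited.
                return path[path.index(node):] + [node]
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None


# Client and server on the same host: the chain loops back to knfsd.
loopback = {
    "knfsd: handle WRITE": "page allocator: free memory",
    "page allocator: free memory": "NFS client: flush dirty pages",
    "NFS client: flush dirty pages": "knfsd: handle WRITE",
}

# Separate hosts (simplified): server-side reclaim does not depend on the
# server's own NFS request handling.
separate_hosts = {
    "server knfsd: handle WRITE": "server page allocator: free memory",
    "server page allocator: free memory": "server local disk: write back",
}
```

In this model find_cycle(loopback) returns the three-party loop, while find_cycle(separate_hosts) returns None because the chain ends at the local disk.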

> 
> At last:
> 
> Sorry, but I do not think it is suitable to mark this bug as CLOSED.

I'm afraid I have to disagree here. This is not something we intend to address unless you can point to a deadlock that can occur when the client and server are separate hosts.

Comment 7 Ric Wheeler 2012-08-17 11:27:59 UTC
I also want to make clear that Red Hat as a company does not support clones of our product. If you want to get Red Hat to work with your company, you need to have a business relationship with us.

Of course, we are more than happy to work on problems in the upstream community. As Jeff stated, the configuration you describe is not supported (and not useful outside of testing as far as I can see).

Please take any further discussion to the upstream list and I would suggest only after you reproduce the deadlock with distinct client and servers as Jeff mentions above.

Thanks!

Comment 8 Mitz Amano 2012-08-17 12:53:30 UTC
(In reply to comment #7)
> I also want to make clear that Red Hat as a company does not support clones
> of our product. If you want to get Red Hat to work with your company, you
> need to have a business relationship with us.
> 

Sure, I think you truly do not need to support another company that has no relationship with Red Hat (such as Asianux).

I also agree that, for Red Hat, supporting this issue may be a waste of time and resources.


> Of course, we are more than happy to work on problems in the upstream
> community. As Jeff stated, the configuration you describe is not supported
> (and not useful outside of testing as far as I can see).
>

> Please take any further discussion to the upstream list and I would suggest
> only after you reproduce the deadlock with distinct client and servers as
> Jeff mentions above.
> 
> Thanks!


Firstly:

   Thank you for remaining open to further discussion if I have new information.



Secondly:

   Suppose the test environment truly has a distinct client and server.

   For IssueX1 (which is discussed in my report document):
       I think it will still be a bug.
       It is still a deadlock on the client side itself.
       I have a fix patch for it.
   Because my original testing did not use a distinct client and server, the conclusion in the report document is not suitable.

   I hope Jeff Layton can help check the patch in chapter "IssueX1" of the report document, when he has time.



Thirdly:

   I will keep reading the nfs module source code in kernel-2.6.18-308.8.2.el5 to verify what Jeff Layton mentions.

   I also need to search for related information on Google.

   It would truly be better if Jeff Layton could provide some supporting material directly.



At last:

   I think at least one thing is certain (from our communication):

   For client & server on the same machine, the nfs commit feature increases the chances of an nfs deadlock.

   Is that correct?


thanks.

Comment 9 Ronald Pacheco 2012-08-17 14:34:10 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > I also want to make clear that Red Hat as a company does not support clones
> > of our product. If you want to get Red Hat to work with your company, you
> > need to have a business relationship with us.
> > 
> 
> Sure, I think you truly do not need to support another company that has no
> relationship with Red Hat (such as Asianux).
> 
> I also agree that, for Red Hat, supporting this issue may be a waste of time
> and resources.
> 
> 
> > Of course, we are more than happy to work on problems in the upstream
> > community. As Jeff stated, the configuration you describe is not supported
> > (and not useful outside of testing as far as I can see).
> >
> 
> > Please take any further discussion to the upstream list and I would suggest
> > only after you reproduce the deadlock with distinct client and servers as
> > Jeff mentions above.
> > 
> > Thanks!
> 
> 
> Firstly:
> 
>    Thank you for remaining open to further discussion if I have new
> information.
> 
> 
> 
> Secondly:
> 
>    Suppose the test environment truly has a distinct client and server.
> 
>    For IssueX1 (which is discussed in my report document):
>        I think it will still be a bug.
>        It is still a deadlock on the client side itself.
>        I have a fix patch for it.
>    Because my original testing did not use a distinct client and server, the
> conclusion in the report document is not suitable.
> 
>    I hope Jeff Layton can help check the patch in chapter "IssueX1" of the
> report document, when he has time.
> 
> 
> 
> Thirdly:
> 
>    I will keep reading the nfs module source code in
> kernel-2.6.18-308.8.2.el5 to verify what Jeff Layton mentions.
> 
>    I also need to search for related information on Google.
> 
>    It would truly be better if Jeff Layton could provide some supporting
> material directly.
> 
> 
> 
> At last:
> 
>    I think at least one thing is certain (from our communication):
> 
>    For client & server on the same machine, the nfs commit feature increases
> the chances of an nfs deadlock.
> 
>    Is that correct?
> 
> 
> thanks.

Chen,

If you can reproduce this problem with stock RHEL, and document how to reproduce this with RHEL, then we can look at it.  If you can't reproduce it with RHEL, then please work it upstream.

Thanks and Regards,

Ron

Comment 10 Mitz Amano 2012-08-17 15:14:06 UTC
(In reply to comment #9)

> Chen,
> 
> If you can reproduce this problem with stock RHEL, and document how to
> reproduce this with RHEL, then we can look at it.  If you can't reproduce it
> with RHEL, then please work it upstream.
> 
> Thanks and Regards,
> 
> Ron


Firstly:

   Thank you for replying and for facing the issue directly.


Secondly:

   Sorry that my English is not very good: does "stock RHEL" mean the Red Hat release (built by Red Hat itself), so that I need to download the binary from the website?

   I did download kernel-2.6.18-308.8.2.el5-x86_64.rpm from the website.

   Please refer to the pictures in the attachment of Bug 836095:

     they show the "uname -a" command, which displays the full version string with the build time;

     they also show the "ps aux | grep D" command, which displays the processes in uninterruptible wait;

     they also show the current time, so you can see the time spent.
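
As a side note on the "ps aux | grep D" check above, grep matches any line that contains the letter D, not only processes in the D (uninterruptible sleep) state. A minimal Python sketch that reads the state field from /proc/<pid>/stat directly could look like this; the function names parse_stat and uninterruptible_processes are hypothetical.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file is unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling uninterruptible_processes() on a Linux host lists the processes that cannot be interrupted, which matches the symptom that "kill -9" has no effect.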


Thirdly:

   In my report document, I described the test environment (but perhaps too briefly; it needs more detail).

   But as Jeff Layton said, it is not difficult to reproduce the issue.

   I will keep working on it to verify what Jeff Layton said (maybe what he said is correct, but I need proof).

   I truly want Jeff Layton to help check my fix patch for IssueX1 (is it valid?).


At last I hope:
  That what I have done does not waste your time, and that the five of us (who have written comments on this bug) can truly get some valuable results.


thanks.

Comment 11 Mitz Amano 2012-08-17 15:51:23 UTC
As you mentioned, I will also try the "upstream community" (sorry, I am not familiar with it, so I will spend a short time getting familiar with it).

It would be helpful if you could briefly introduce it to me, if you have time.

Comment 12 Mitz Amano 2012-09-11 02:19:35 UTC
Conclusion:
    I have received confirmation from both upstream and the LTP organization. The conclusion is truly what Jeff Layton said (what he said is correct).


1) For upstream community:

   I mailed linux-nfs.org and linux-kernel.org;
   only Jeff Layton replied, and nobody else replied within a week.
   "No reply within a week" means "automatically confirms what Jeff Layton said".


2) For LTP organization:

   I mailed ltp-list.net;
   I got a reply, but within a week no reply was useful for the confirmation.
   "No useful reply within a week" means "automatically confirms what Jeff Layton said".


3) Next:

   I will change LTP features to match the test requirements of RHEL5 and RHEL6.
   If I find another "bug", I will come back to Red Hat Bugzilla to confirm it.
   If it is truly a bug, I will try to fix it, too.

thanks.

Comment 13 Linda Wang 2013-04-24 04:27:39 UTC
*** Bug 836095 has been marked as a duplicate of this bug. ***