Bug 439196
Summary: pidof hangs in access_process_vm
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.6
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Reporter: Janne Karhunen <jkarhune>
Assignee: Red Hat Kernel Manager <kernel-mgr>
QA Contact: Martin Jenner <mjenner>
CC: jbaron, lwang, pcfe, peterm, staubach
Doc Type: Bug Fix
Last Closed: 2008-04-29 13:40:29 UTC
Description
Janne Karhunen
2008-03-27 14:53:40 UTC
Created attachment 299335 [details]
crash backtrace of pidof while hanging
A slow console and the HW watchdog meant we did not get any sysrq-t output from this hardware, but that is now solved. We should get a complete task list from the next occurrence.

OK, thanks. BTW, is this reproducible on my system? The system is
obviously stuck here:
------------------------------------
int access_process_vm(...)
{
        struct mm_struct *mm;
        struct vm_area_struct *vma;
        struct page *page;
        void *old_buf = buf;

        mm = get_task_mm(tsk);
        if (!mm)
                return 0;

>>>     down_read(&mm->mmap_sem);
------------------------------------
But there are hundreds of other down_write(...->mmap_sem) calls
on that architecture that could cause this problem...
Can you get an AltSysrq-T when this happens so I can see what the
process that holds the semaphore is doing?
Larry
Sysrq-t is in the works. With any luck we'll get it tomorrow morning.

I'll try to make a reproducer testcase on one of the RHTS systems. I'm feeling lucky..

The most basic imaginable testcase (killing/starting hordes of processes and running pidof against them) does not seem to reproduce it. I'm betting this may have something to do with the mount being done just prior to checking for the existence of leftover NFS tasks. Just a guess, though.

Created attachment 300059 [details]
partial sysrq

This is tricky. The WD is not the cause of the reset; it has to be something else. We can only get partial sysrq's.

Created attachment 300244 [details]
Probably complete backtrace from Crash
Created attachment 300249 [details]
Another crash-bt
NOTE: attachment id 300249 is verified to be from a NON-RECOVERABLE occurrence.

Umm, one of the tasks on top of NFS holds the semaphore that is required for NFS to start up :) ?

I suspect this problem was introduced in linux-2.6.9-futex.patch:

* Thu May 10 2007 Jason Baron <jbaron> [2.6.9-55.2]
- fix for futex()/FUTEX_WAIT race condition (Ernie Petrides) [217067]

Can you try kernel-2.6.9-55.1 and see if the problem goes away?
Larry

Yeah, no (obvious) luck with the NFS guess.

Hmm, IMHO my initial guess may still be valid. It may be that task 11748 holds the write side of mmap_sem in sys_mmap, blocking just about everyone. That task may not be proceeding because NFS is not up, and given that NFS startup is hanging with 'pidof' waiting for that same semaphore, we have a deadlock.

So we'll try both cases. We'll remove the patch Larry suggested on one system, and on another we'll move the pidof call to a point where basic NFS is already up. I'm willing to bet Larry a cup of machine coffee on this one :)

It took a day to find a second cluster for testing with the 55.1 kernel, but we found one and will set it up on Monday and start the test. System 1 has been testing the fix that moves the pidof call to a point where nfsd/mountd are already up, and the fault has not shown up yet. The implications of this are yet to be properly understood: it may mean that the whole NFS failover concept is flaky, at least when it comes to having the NFS client and server on the same node.

We have not seen this bug again since the pidof call was moved to a point where nfsd/mountd are already up. We'll keep the test running for another day to be 'sure'. The build with the 55.1 kernel is also ready but has not been installed yet.

Verified: the problem does not show up after moving the pidof call.

Verified: using the 55.1 kernel does not resolve this issue.

OK, so what does this all mean? Is the whole failover logic flawed when NFS is in the picture?
Larry

To me it means that, with bad luck, tasks running on top of an NFS mount may cause the system to deadlock when the NFS server itself migrates to the same node. I take it not too many people are doing this..

To summarize: provided that NFS server migration from an external host to the local host occurs at the same time that a local NFS client task is holding mmap_sem, we can have a deadlock. The pidof calls in the NFS server startup iterate over all tasks (/proc/pid/cmdline), and they will stall once they hit this task: and this task never proceeds because the server is not coming back up. Larry, any major holes in this theory?