Bug 429102

Summary: Allocations on resume path can cause deadlock due to attempting to swap
Product: Red Hat Enterprise Linux 5 Reporter: Ian Campbell <ijc>
Component: kernel-xen Assignee: Paolo Bonzini <pbonzini>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: low Docs Contact:
Priority: low    
Version: 5.1 CC: clalance, ddutile, pbonzini, xen-maint
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2011-01-13 20:41:15 UTC Type: ---
Bug Depends On:    
Bug Blocks: 514490    
Attachments:
Description Flags
linux-2.6.18-xen.hg 377:e8b49cfbdac0 backported to 2.6.18-53.1.4.el none

Description Ian Campbell 2008-01-17 10:57:20 UTC
Allocations made on the resume path, particularly in the blkfront reattach path,
can trigger swap activity that cannot complete because we are in the middle
of reattaching the swap disk, so the resume deadlocks. The solution is to use
__GFP_HIGH on such allocations, which lets them use the emergency pool if
necessary.

This was fixed upstream by linux-2.6.18-xen.hg 377:e8b49cfbdac0
http://hg.uk.xensource.com/linux-2.6.18-xen.hg?cs=e8b49cfbdac0

This issue affects RHEL5u1 (2.6.18-53.1.4.el) and RHEL4u6 (2.6.9-67.0.1.EL). I
will clone this issue into a RHEL4 issue as well.

To reproduce, run a guest workload with large memory consumption (such as a
userspace memtest type application). The issue is seen after a few iterations of
save and restore, typically fewer than a dozen.

Comment 1 Ian Campbell 2008-01-17 10:57:20 UTC
Created attachment 291977
linux-2.6.18-xen.hg 377:e8b49cfbdac0 backported to 2.6.18-53.1.4.el

Comment 2 Ian Campbell 2008-01-17 11:00:37 UTC
hg.uk.xensource.com is an internal address; the external address is
http://xenbits.xensource.com/linux-2.6.18-xen.hg?cs=e8b49cfbdac0.


Comment 3 Ian Campbell 2008-03-28 11:18:48 UTC
I'm afraid I've found a couple more instances of this issue:
http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/fdb998e79aba
http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/0637d22ed554

They apply unmodified to 2.6.18-53.1.14.el5.

The thread-related one is a bit subtle: the xenwatch thread blocks with
xenwatch_mutex held in kthread_create, waiting for the completion that says the
new thread has been spawned successfully. The thread doing the spawning is stuck
waiting on I/O due to an attempt to swap while allocating memory in
copy_process. This causes the suspend process to block waiting for
xenwatch_mutex, and therefore the swap device never gets reattached.

These traces are from a linux-2.6.18-xen kernel but the code paths are the same.

suspend       D C02DF2F5     0 14792      1         14772 14790 (L-TLB)
       c20b5ea8 00000246 00000002 c02df2f5 00000008 c11ba000 00000000 c038aa00 
       c038aa00 00000000 c3e57660 00000000 00000000 00000009 c3e57550 89c61ba6 
       00023227 000004b3 c3e57660 c1101960 00000002 89b240b0 89c616f3 00023227 
Call Trace:
 [<c02df2f5>] __mutex_lock_slowpath+0xc5/0x2f0
 [<c02df528>] mutex_lock+0x8/0x10
 [<c025ef0f>] unregister_xenbus_watch+0x12f/0x1a0
 [<c025f82b>] free_otherend_watch+0x1b/0x40
 [<c025f869>] talk_to_otherend+0x19/0x40
 [<c02608aa>] resume_dev+0x2a/0xd0
 [<c0252d54>] bus_for_each_dev+0x54/0x80
 [<c02609e4>] xenbus_resume+0x44/0x50
 [<c025aa3a>] __xen_suspend+0x9a/0x110
 [<c025a1a8>] xen_suspend+0x68/0xd0
 [<c0102b55>] kernel_thread_helper+0x5/0x10

The suspend thread is blocked waiting to lock xenwatch_mutex in unregister_xenbus_watch:
	/* Flush any currently-executing callback, unless we are it. :-) */
	if (current->pid != xenwatch_pid) {
		mutex_lock(&xenwatch_mutex);
		mutex_unlock(&xenwatch_mutex);
	}

the current holder is the xenwatch thread:

xenwatch      D C02DE102     0     9      7            10       (L-TLB)
       c11bbee8 00000246 00000002 c02de102 89c366df 00023227 c53e7200 c038aa00 
       c038aa00 00023227 89c54017 00023227 00000000 0000000a c11b6a70 89c5416d 
       00023227 00000f8d c11b6b80 c1101960 0000008f 00000000 89c531e0 00023227 
Call Trace:
 [<c02de102>] wait_for_completion+0x82/0xf0
 [<c0136c0c>] kthread_create+0x7c/0xd0
 [<c025f33b>] xenwatch_thread+0x10b/0x140
 [<c0136b86>] kthread+0x106/0x110
 [<c0102b55>] kernel_thread_helper+0x5/0x10

and the kthread worker that is spawning the new thread:

kthread       D C02DE736     0     7      1     9     758     6 (L-TLB)
       c11a9a60 00000246 00000002 c02de736 00000000 c11a9a08 00000003 c038aa00 
       c038aa00 c11a9ff8 c11bdf80 00000003 00000000 00000009 c1165550 89c616f3 
       00023227 0000d586 c1165660 c1101960 c01058b1 00000003 89c5416d 00023227 
Call Trace:
 [<c02de736>] io_schedule+0x26/0x30
 [<c02226aa>] get_request_wait+0xca/0x110
 [<c0223717>] __make_request+0x87/0x3b0
 [<c022141a>] generic_make_request+0xea/0x1b0
 [<c0223c8b>] submit_bio+0x6b/0x120
 [<c015f9ba>] swap_writepage+0x9a/0xc0
 [<c014f67a>] shrink_zone+0xefa/0x1080
 [<c014ff4a>] try_to_free_pages+0xca/0x1f0
 [<c014ad78>] __alloc_pages+0x178/0x2f0
 [<c01671fa>] cache_alloc_refill+0x2ea/0x590
 [<c0166eff>] kmem_cache_alloc+0x9f/0xb0
 [<c011e8c7>] copy_process+0x97/0x1240
 [<c011fd5b>] do_fork+0x6b/0x1c0
 [<c0102fdb>] kernel_thread+0x8b/0xa0
 [<c0136a27>] keventd_create_kthread+0x27/0x80
 [<c0132be5>] run_workqueue+0x75/0xf0
 [<c0133918>] worker_thread+0x138/0x160
 [<c0136b86>] kthread+0x106/0x110
 [<c0102b55>] kernel_thread_helper+0x5/0x10
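
Taken together, the three traces form a circular wait. The following is a minimal userspace C sketch of that dependency chain, written only to make the cycle explicit. It is a model, not kernel code: the enum labels, RUNNABLE, build_wait_graph and is_deadlocked are all hypothetical names invented for this illustration, and the last edge (swap I/O cannot complete until the suspend thread finishes reattaching the device) is taken from the description in this bug rather than from a trace.

```c
/* Userspace model of the circular wait in the traces above; not kernel code.
 * All names below are hypothetical labels chosen for this sketch. */
enum party {
    SUSPEND_THREAD,   /* blocked in mutex_lock(&xenwatch_mutex) */
    XENWATCH_THREAD,  /* holds xenwatch_mutex, waits in kthread_create */
    KTHREAD_WORKER,   /* forking the new thread, allocating in copy_process */
    SWAP_DEVICE,      /* swap I/O, unusable until blkfront is reattached */
    N_PARTIES
};

#define RUNNABLE (-1)  /* marker: this party is not blocked on anyone */

/* Fill waits_on[] so that waits_on[p] is the party p is blocked on.
 * alloc_may_swap != 0 models an allocation allowed to start swap I/O. */
void build_wait_graph(int waits_on[N_PARTIES], int alloc_may_swap)
{
    waits_on[SUSPEND_THREAD]  = XENWATCH_THREAD;  /* needs xenwatch_mutex */
    waits_on[XENWATCH_THREAD] = KTHREAD_WORKER;   /* needs the completion */
    waits_on[KTHREAD_WORKER]  = alloc_may_swap ? SWAP_DEVICE : RUNNABLE;
    waits_on[SWAP_DEVICE]     = SUSPEND_THREAD;   /* needs resume to finish */
}

/* Return 1 if `start` can never run again, 0 if its wait chain ends at a
 * runnable party. Each party waits on at most one other, so a chain that is
 * still blocked after N_PARTIES hops must have entered a cycle. */
int is_deadlocked(const int waits_on[N_PARTIES], int start)
{
    int cur = start;
    for (int hops = 0; hops < N_PARTIES; hops++) {
        if (waits_on[cur] == RUNNABLE)
            return 0;
        cur = waits_on[cur];
    }
    return 1;
}
```

With alloc_may_swap set, is_deadlocked reports the suspend thread as stuck; with it cleared, the worker is runnable, the chain unwinds, and the suspend thread can go on to reattach the device. Breaking that one edge is what the patches aim to do.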


Comment 4 Ian Campbell 2008-06-10 10:33:55 UTC
The fixes given above worked for me in practice; however, according to upstream
the correct fix is to use GFP_NOIO:
http://marc.info/?l=linux-kernel&m=121222807617695&w=2

This has been applied to the Xen kernel at:
http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/5db911a71eac
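
As far as I understand the difference, __GFP_HIGH only makes it less likely that the allocation has to reclaim memory, whereas GFP_NOIO forbids reclaim from starting I/O at all. The sketch below is a small userspace C model of that distinction. The DEMO_ names and bit values are assumptions made for this illustration and should not be read as the kernel's definitions. The two predicates are likewise a simplification of what the page allocator does.

```c
#include <stdbool.h>

/* Demo flag bits. The names echo the kernel's gfp flags, but the DEMO_
 * macros and their values are assumptions made for this sketch. */
#define DEMO_GFP_WAIT 0x10u  /* caller may sleep and enter page reclaim */
#define DEMO_GFP_HIGH 0x20u  /* caller may dip into the emergency pool */
#define DEMO_GFP_IO   0x40u  /* reclaim may start physical I/O (swap-out) */
#define DEMO_GFP_FS   0x80u  /* reclaim may call into filesystem code */

#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_NOIO   (DEMO_GFP_WAIT)

/* Simplified rule: swap-out is only possible if the caller may both
 * enter reclaim and start I/O from it. */
bool demo_may_start_swap_io(unsigned int flags)
{
    return (flags & DEMO_GFP_WAIT) && (flags & DEMO_GFP_IO);
}

/* Simplified rule: only allocations marked high priority may use the
 * emergency pool before falling back to reclaim. */
bool demo_may_use_reserves(unsigned int flags)
{
    return (flags & DEMO_GFP_HIGH) != 0;
}
```

In this model, DEMO_GFP_KERNEL | DEMO_GFP_HIGH may use the reserves but is still allowed to start swap I/O once they run out, while DEMO_GFP_NOIO can never start swap I/O. That would be consistent with the earlier patches working in practice without closing the window completely.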

Comment 5 Don Dutile (Red Hat) 2008-11-20 19:53:57 UTC
Above patches applied to pending 5.3 release.

Comment 6 Ian Campbell 2009-12-03 14:37:43 UTC
It has taken me rather a long time (sorry about that) to notice that one of the patches in this ticket was not applied: http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/0637d22ed554 seems to be missing.

Comment 7 Don Dutile (Red Hat) 2009-12-03 16:15:50 UTC
Can you give me a test scenario to check this fix in RHEL5?

Comment 8 Ian Campbell 2009-12-03 17:14:33 UTC
IIRC it is simply necessary to suspend/resume or live migrate repeatedly while the guest is under heavy memory pressure and/or swapping heavily. It's a very rare occurrence though, and it was a while back so the details are a bit hazy -- comment #3 above is the best description I could find.

We have a userspace memtest type utility, one of the workloads used in our testing, which is probably what triggered this issue.

Comment 9 Chris Lalancette 2009-12-07 13:19:47 UTC
(In reply to comment #6)
> It has taken me rather a long time (sorry about that) to notice that one of the
> patches in this ticket was not applied,
> http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/0637d22ed554 seems
> to be missing.  

Right, I see.  We took this patch into RHEL-4, but we missed it for RHEL-5.  OK, we'll add it to the list.

Thanks for the review,
Chris Lalancette

Comment 10 RHEL Program Management 2010-08-04 12:09:32 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 Jarod Wilson 2010-09-03 19:05:04 UTC
in kernel-2.6.18-215.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 15 errata-xmlrpc 2011-01-13 20:41:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html