Bug 648854 - linux-2.6.18: netback: take net_schedule_list_lock when removing entry from net_schedule_list
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.7
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Laszlo Ersek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 514489 KernelXenUpstream
Reported: 2010-11-02 10:11 UTC by Laszlo Ersek
Modified: 2011-07-21 10:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-21 10:27:14 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:1065 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update 2011-07-21 09:21:37 UTC

Description Laszlo Ersek 2010-11-02 10:11:59 UTC
Quoting <4CCFD63D0200007800020366.novell.com> from xen-devel (http://lists.xensource.com/archives/html/xen-devel/2010-11/msg00046.html):

"There is a race in net_tx_build_mops between checking if net_schedule_list is empty and actually dequeuing the first entry on the list. If another thread dequeues the only entry on the list during this window we crash because list_first_entry expects a non-empty list, like so: [...]"

Once upstream merges the patch in the email, backport it to RHEL5.7.
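For illustration only, the general shape of such a fix can be sketched in user space. This is not the upstream patch: it assumes a pthread mutex in place of the kernel spinlock, and the names schedule_list, schedule_list_lock, struct netif, schedule_netif() and dequeue_first() are hypothetical stand-ins. The point is that the emptiness check and the removal of the first entry happen under the same lock, so no other thread can empty the list in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list in the style of the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_add_tail_entry(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del_entry(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;   /* leave the node self-linked after removal */
}

/* Hypothetical stand-ins for the schedule list and its lock. */
static struct list_head schedule_list = { &schedule_list, &schedule_list };
static pthread_mutex_t schedule_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical element type; the real netback structure is much larger. */
struct netif { int id; struct list_head list; };

static void schedule_netif(struct netif *n)
{
    assert(n != NULL);
    pthread_mutex_lock(&schedule_list_lock);
    list_add_tail_entry(&n->list, &schedule_list);
    pthread_mutex_unlock(&schedule_list_lock);
}

/*
 * Return the first queued entry, or NULL if the list is empty.
 * The check and the dequeue are done while holding the lock, so a
 * concurrent dequeuer cannot remove the only entry between the two steps.
 */
static struct netif *dequeue_first(void)
{
    struct netif *n = NULL;

    pthread_mutex_lock(&schedule_list_lock);
    if (schedule_list.next != &schedule_list) {
        struct list_head *first = schedule_list.next;
        n = (struct netif *)((char *)first - offsetof(struct netif, list));
        list_del_entry(first);
    }
    pthread_mutex_unlock(&schedule_list_lock);
    return n;
}
```

With this shape, a caller that loses the race simply gets NULL back and can stop processing, instead of dereferencing an entry that is no longer on the list.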

Comment 1 Laszlo Ersek 2010-11-03 12:46:56 UTC
Upstream commit: http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/1045

Comment 2 Laszlo Ersek 2010-11-03 15:42:07 UTC
Another patch to backport in order to fix the compilation failure caused by the previous patch:
- msgid: 4CD1809B02000078000206A7.novell.com,
- URL: http://lists.xensource.com/archives/html/xen-devel/2010-11/msg00127.html

(Not yet committed by upstream.)

Comment 3 Laszlo Ersek 2010-11-03 16:04:41 UTC
list_first_entry() is already in RHEL5's "include/linux/list.h" (twice, for that matter), so Comment 2 is superfluous.
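For reference, a user-space sketch of what the macro does, assuming the usual mainline definition (list_entry() applied to the head's next pointer). The container_of() here is simplified, without the kernel's type checking, and struct item and first_item_or_null() are hypothetical names used only for illustration. It also shows why the macro must not be used on an empty list: the head's next pointer then refers to the head itself, not to a real entry.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Simplified container_of: step back from the member to the enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Assumed definition: the entry that embeds head->next. */
#define list_first_entry(ptr, type, member) \
    list_entry((ptr)->next, type, member)

/* Hypothetical element type for illustration. */
struct item { int value; struct list_head node; };

/*
 * Hypothetical guarded helper: on an empty list head->next == head, so
 * list_first_entry() would compute a pointer from the head itself.
 * Return NULL in that case instead.
 */
static struct item *first_item_or_null(struct list_head *head)
{
    assert(head != NULL);
    if (head->next == head)
        return NULL;
    return list_first_entry(head, struct item, node);
}
```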

Comment 4 RHEL Program Management 2011-02-01 16:51:38 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Jarod Wilson 2011-02-09 14:56:42 UTC
Patch included in kernel-2.6.18-243.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 8 Jinxin Zheng 2011-05-03 09:42:11 UTC
Could we possibly reproduce this issue on an unpatched kernel? How can we intentionally make the race condition happen?

Thanks.

Comment 9 Laszlo Ersek 2011-05-03 09:52:34 UTC
Hello Jinxin,

(In reply to comment #8)
> Could we possibly reproduce this issue on an unpatched kernel? How can we
> intentionally make the race condition happen?

Please follow the link under comment 5 and look at the "Testing" section in my message there -- you'll see that unfortunately I was unable to trigger the race, even though I tried to modify the kernel to achieve that.

Comment 10 Jinxin Zheng 2011-05-03 10:17:55 UTC
OK. Since this is hard to reproduce even on an intentionally modified kernel, and our acceptance tests against Xen show that the patch does not appear to cause any slowdown or other problems with the guest's network, I'll set this to VERIFIED, provided there is no harm in taking the patch and it is believed to fix a potential problem.

Comment 11 errata-xmlrpc 2011-07-21 10:27:14 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

