
Bug 495230

Summary: kernel dm: Oops in mempool_free when device removed
Product: Red Hat Enterprise Linux 5
Reporter: Milan Broz <mbroz>
Component: kernel
Assignee: Milan Broz <mbroz>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: high
Version: 5.3
CC: agk, dzickus, emcnabb, pvrabec
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-09-02 08:29:45 UTC

Description Milan Broz 2009-04-10 15:12:40 UTC
Description of problem:

Basically this is a dupe of RHEL4 bug #465047 and a backport of commit
b35f8caa0890169000fec22902290d9a15274cbd

When a table is being replaced, it waits for I/O to complete
before destroying the mempool, but the endio function doesn't
call mempool_free() until after completing the bio.

Fix it by swapping the order of those two operations.

The same problem occurs in dm.c with md referenced after dec_pending.

The problem is very hard to reproduce. With a wait injected directly into the kernel, the Oops can look like this:
Pid: 29018, comm: loop0 Tainted: G      2.6.18-prep #1
RIP: 0010:[<ffffffff8002efe3>]  [<ffffffff8002efe3>] mempool_free+0x71/0x74
RSP: 0018:ffff81018a4efe88  EFLAGS: 00010212
RAX: 00000000766f6d65 RBX: ffff8101860c8440 RCX: 00000000c0000100
RDX: ffff81000101d480 RSI: 2d6d642f6b636f6c RDI: ffff810186315740
RBP: 0000000000000000 R08: ffff810186315740 R09: 00000000000000ff
R10: ffff8101ac0ccac0 R11: 636f6c623d4d4554 R12: ffff810186315740
R13: 0000000000000000 R14: 0000000000010a00 R15: 0000000000000001
FS:  00002b7e34a7b250(0000) GS:ffff810105fa49c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000329c540 CR3: 000000015b8e0000 CR4: 00000000000006e0
Process loop0 (pid: 29018, threadinfo ffff81018a4ee000, task ffff81017f8a00c0)
Stack:  ffffffff8861c6d2 ffff81019f37e000 ffff8101ac99c820 ffff81019f37e000
 ffff8101860c8440 0000000000000000 ffffffff88610522 0000100055557180
 ffff81019f37e130 0000000155557180 0000000000000000 ffffffff8861073d
Call Trace:
 [<ffffffff8861c6d2>] :dm_crypt:crypt_endio+0x83/0x8d
 [<ffffffff88610522>] :loop:loop_thread+0x370/0x396
 [<ffffffff8861073d>] :loop:do_lo_send_aops+0x0/0x172
 [<ffffffff800b4b4c>] audit_syscall_exit+0x31b/0x336
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff886101b2>] :loop:loop_thread+0x0/0x396
 [<ffffffff8005dfa7>] child_rip+0x0/0x11

Comment 1 Milan Broz 2009-04-10 15:13:59 UTC
(fixing bug typo)

> Basically this dupe of RHEL4 bug #456047

Comment 2 RHEL Program Management 2009-04-10 15:28:41 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 4 Don Zickus 2009-04-20 17:12:38 UTC
in kernel-2.6.18-140.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 8 errata-xmlrpc 2009-09-02 08:29:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html