Bug 182970
Summary: | slab leaking (bio and biovec-1)
---|---
Product: | [Fedora] Fedora
Reporter: | Charles Lopes <tjarls>
Component: | kernel
Assignee: | Dave Jones <davej>
Status: | CLOSED RAWHIDE
QA Contact: | Brian Brock <bbrock>
Severity: | medium
Priority: | medium
Version: | rawhide
CC: | mbrancaleoni, oliva, pfrields, wtogami
Hardware: | x86_64
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2006-03-02 04:12:55 UTC
Description
Charles Lopes
2006-02-24 20:12:20 UTC
Created attachment 125207: list of loaded modules
Created attachment 125208: /proc/meminfo
Created attachment 125209: /proc/slabinfo
*** Bug 183017 has been marked as a duplicate of this bug. ***

FYI, to do some testing, I upgraded FC5-test to vanilla kernel 2.6.16-rc4 and the problem remains. I then downgraded to vanilla kernel 2.6.14.7 and the problem disappeared. So perhaps this is a problem with bio in the mainline kernel? Of course, 2.6.14.7 doesn't work well on FC5 (mainly because of the newer udev).

I did some tests on 2.6.15-rcX kernels and found that everything is OK up to 2.6.15-rc5; things break in 2.6.15-rc6. Hope this is useful.

This patch:

    commit 3795bb0fc52fe2af2749f3ad2185cb9c90871ef8
    Author: NeilBrown <neilb>
    Date:   Mon Dec 12 02:39:16 2005 -0800

        [PATCH] md: fix a use-after-free bug in raid1

        Who would submit code with a FIXME like that in it !!!!

        Signed-off-by: Neil Brown <neilb>
        Signed-off-by: Andrew Morton <akpm>
        Signed-off-by: Linus Torvalds <torvalds>

causes the problem. I think this should now go to the kernel developers.

The patch correctly releases the bio later in the function (before, it was freed and then used), but adds an if (blah == NULL) in front of the bio_put... I'm by no means a kernel expert, but why is the if needed?

The bio_put after the if statement doesn't look bad. You want to do it in the cases where the statement "r1_bio->bios[mirror] = NULL;" was executed previously. I believe the problem may be due to an exit point prior to the bio_put. Here's the piece of code:

    if (test_bit(R1BIO_BarrierRetry, &r1_bio->state)) {
            reschedule_retry(r1_bio);
            /* Don't dec_pending yet, we want to hold
             * the reference over the retry */
            return 0;
    }

A bio_put (with the test?) may be needed just before the return. I'll give it a try tonight.

Mmmh, so why wasn't that done previously? I mean, previously bio_put was called and the bio was then used again, so they simply moved bio_put after the last use of the bio (the use-after-free problem). But why did they add the if? What if the value is != NULL? Then bio_put is never called... hence the slab leak.
I rebuilt the latest 2.6.15 kernel with just the if statement deleted, and the machine doesn't seem to have a single problem right now. But again, I'm not a kernel expert, so perhaps the if is right and some other cases must be considered.

Should be fixed in tomorrow's rawhide. (Grab it early from http://people.redhat.com/davej/kernels/Fedora/devel)

*** Bug 183555 has been marked as a duplicate of this bug. ***

It is definitely fixed in 2008, but today's rawhide still has 1996.