Red Hat Bugzilla – Bug 837155
jbd can't process 512B block size correctly, make system crash.
Last modified: 2014-06-02 09:21:53 EDT
Created attachment 595861
make jbd fit for 512B block size.
Description of problem:
The system crashes at random when testing ocfs2 with a 512B block size, even across two separate mounts.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Only a single node is needed to reproduce this.
1. mkfs.ocfs2 -b 512
2. Mount the volume, write a file to it with dd, then umount.
3. Repeat the test with mkfs.ocfs2 -b 1024
I am sure this bug is in jbd.
I installed an OS with ext2 as the root fs, so that the jbd module could be removed together with the ocfs2 module on unload. Then I ran this test:
1. mkfs.ocfs2 -b 512
2. Mount, dd, then umount.
3. /etc/init.d/o2cb unload (this also removes jbd)
4. Repeat the test with mkfs.ocfs2 -b 1024
I hunted down the root cause:
1. On the first mount of an ocfs2 volume with a 512B block size, JBD created a new slab cache. Its name is selected by the index 512 >> 11, which is 0, so without this patch the cache was named "jbd_1k" but created with a slab size of 512 bytes.
2. This slab is not destroyed until the jbd module is removed.
3. The next time an ocfs2 volume is mounted with a 1K block size, the index is again 0 (1024 >> 11), which gives "jbd_1k", the same name as in step 1. The slab already exists, but its objects are only 512 bytes. JBD goes on to use this slab as if it held 1K objects, so it writes past the end of each allocation and corrupts adjacent memory, including pointers. That is what leads to the crash.
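The collision described in the steps above can be sketched in userspace C. This is an illustration only, not the kernel code: the name table is an assumption reconstructed from the "jbd_1k" name and the size >> 11 index mentioned in this report, and slab_name_for(), slab_get() and overrun_bytes() are hypothetical helpers that record an object size instead of creating a real slab cache.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLABS 5
/* Index calculation as described in the report: block size >> 11. */
#define SLAB_INDEX(size) ((size) >> 11)

/* Assumed name table; only "jbd_1k" at index 0 is confirmed by the report. */
static const char *const slab_names[NUM_SLABS] = {
    "jbd_1k", "jbd_2k", "jbd_4k", NULL, "jbd_8k"
};

/* Hypothetical stand-in for the slab caches: 0 means "not created yet". */
static size_t slab_object_size[NUM_SLABS];

/* Name that would be chosen for a given block size. */
const char *slab_name_for(size_t block_size)
{
    assert(block_size >= 512 && block_size <= 8192);
    return slab_names[SLAB_INDEX(block_size)];
}

/* Create the slab on first use, otherwise reuse it; return its object size. */
size_t slab_get(size_t block_size)
{
    size_t idx = SLAB_INDEX(block_size);

    assert(idx < NUM_SLABS);
    if (slab_object_size[idx] == 0)
        slab_object_size[idx] = block_size;
    return slab_object_size[idx];
}

/* Bytes written past the end of an object when a full block is copied in. */
size_t overrun_bytes(size_t block_size)
{
    size_t have = slab_get(block_size);

    return block_size > have ? block_size - have : 0;
}
```

Calling slab_get(512) and then slab_get(1024) returns 512 both times, because both sizes map to index 0 and the first caller fixes the object size. A later 1K user would then overrun each object by 512 bytes, which matches the memory corruption described in step 3.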
Red Hat does not support OCFS2, but this jbd bug might show up in ext3/4 as well so it is worth investigating.
The analysis of the bug certainly looks plausible.
For what it's worth, these slab allocations were removed upstream, so it's probably reasonable to put a small fix for this slab naming bug in RHEL.
Author: Mingming Cao <email@example.com>
Date: Tue Oct 16 18:38:25 2007 -0400
JBD: JBD slab allocation cleanups
JBD: Replace slab allocations with page allocations
Hmm, and Eric and I realized that ext* has a minimum block size of 1k, which is almost certainly why jbd didn't correctly support 512B blocks.
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).