Bug 451068
Field | Value
---|---
Summary | ext3: oops in do_split, miscompilation with gcc 4.3.1
Product | [Fedora] Fedora
Component | gcc
Version | rawhide
Hardware | All
OS | Linux
Status | CLOSED RAWHIDE
Severity | medium
Priority | medium
Reporter | Eric Sandeen <esandeen>
Assignee | Jakub Jelinek <jakub>
QA Contact | Fedora Extras Quality Assurance <extras-qa>
CC | arekm, atkac, clumens, jmccann, jmtaylor90, katzj, petersen, redwolfe, sangu.fedora, thethirddoorontheleft, yaneti
Fixed In Version | 4.3.1-3
Doc Type | Bug Fix
Last Closed | 2008-06-26 04:43:42 UTC

Description
Eric Sandeen
2008-06-12 17:01:47 UTC
Created attachment 309103 [details]
first part of oops
Created attachment 309105 [details]
2nd part of oops
Created attachment 309106 [details]
3rd part of oops
actually I'll take this, I think it's my fault and I can reproduce it :)

I had a hunch that it might be gcc's fault; all the oopsing kernels were built on shiny new 4.3.1, I tested 4.3.0 and had no problems.

Thanks to Roland for all his help looking into this one....

<roland> the bug is that for ptr[-1].size it went from *(short*)&ptr[-1].size to *(long*)&ptr[-1].size
<roland> it's gcc's fault

I'll get a proper gcc bug report filed tonight or tomorrow... in the meantime looks like gcc 4.3.1 in rawhide is slightly busted...

-Eric

This is with:

[root@magnesium ~]# rpm -q gcc
gcc-4.3.1-1.i386
[root@magnesium ~]# gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.1 20080609 (Red Hat 4.3.1-1) (GCC)

Created attachment 309166 [details]
preprocessed namei.i from 2.6.26-0.57.rc5.git3.fc10.i686
Created attachment 309167 [details]
do_split disassembly from 4.3.0
Created attachment 309168 [details]
do_split disassembly from 4.3.1
The interesting bit:

	for (i = count-1; i >= 0; i--) {
		/* is more than half of this entry in 2nd half of the block? */
		if (size + map[i].size/2 > blocksize/2)
 906:  8b 7d a0       mov    -0x60(%ebp),%edi
 909:  31 f6          xor    %esi,%esi
 90b:  31 d2          xor    %edx,%edx
 90d:  8b 45 d4       mov    -0x2c(%ebp),%eax
 910:  8b 5d 98       mov    -0x68(%ebp),%ebx
 913:  d1 ef          shr    %edi
 915:  8d 4c 18 fe    lea    -0x2(%eax,%ebx,1),%ecx
 919:  66 8b 19       mov    (%ecx),%bx

The only difference between compilers seems to be %bx vs. %ebx on this last line.

map[i].size is a u16, and it looks like what is happening is that if it loads 4 bytes instead of 2, it crosses the page boundary and we go "BUG: unable to handle kernel paging request at <first byte in next page>"

Thanks,
-Eric

What exact gcc options were used to compile namei.i?

Sorry, knew I was forgetting something:

gcc -Wp,-MD,/root/ext3/.namei.o.d -nostdinc -isystem /usr/lib/gcc/i386-redhat-linux/4.3.1/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -fno-stack-protector -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 -mtune=generic -mtune=generic -ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Iinclude/asm-x86/mach-generic -Iinclude/asm-x86/mach-default -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(namei)" -D"KBUILD_MODNAME=KBUILD_STR(ext3)" -c -o /root/ext3/namei.o /root/ext3/namei.c

Ah that was namei.o; here's namei.i just to be exact about what you asked:

gcc -E -Wp,-MD,/root/ext3/.namei.i.d -nostdinc -isystem /usr/lib/gcc/i386-redhat-linux/4.3.1/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -fno-stack-protector -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 -mtune=generic -mtune=generic -ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Iinclude/asm-x86/mach-generic -Iinclude/asm-x86/mach-default -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(namei)" -D"KBUILD_MODNAME=KBUILD_STR(ext3)" -o /root/ext3/namei.i /root/ext3/namei.c

*** Bug 451573 has been marked as a duplicate of this bug. ***

*** Bug 451546 has been marked as a duplicate of this bug. ***

*** Bug 451487 has been marked as a duplicate of this bug. ***

Jakub, any ETA on a fix for this? Should we un-tag gcc 4.3.1 from rawhide for now?

Thanks,
-Eric

meanwhile, as a workaround for rawhide installs, use ext2 instead of ext3 or ext4

it hits the ext4 filesystems as well.

Actually any ext* filesystem which enables the dir_index feature is likely susceptible; another workaround would be to turn this feature off.

-Eric

Should be fixed in gcc-4.3.1-3.

WORKSFORME, I rebuilt the latest kernel w/ this version, did a big yum update, no problems. I think 2.6.26-0.93.rc8.fc10 should be the first kernel built with this.

Thanks!
-Eric
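
For anyone reading this later, the %bx vs. %ebx point in the do_split analysis above can be sketched in plain C. This is only an illustration, not kernel code: the 4 KiB page size is an assumption (typical for i386), and DEMO_PAGE_SIZE, load_crosses_page and last_u16_in_page are hypothetical names made up for this sketch. It shows why a 2-byte load of a u16 sitting in the last two bytes of a page stays inside that page, while a 4-byte load from the same address touches the first bytes of the next page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as is typical on i386. */
#define DEMO_PAGE_SIZE 4096u

/*
 * Returns 1 if a load of `width` bytes starting at `addr` touches
 * more than one page, 0 otherwise. `width` must be at least 1.
 */
static int load_crosses_page(uintptr_t addr, unsigned width)
{
    uintptr_t first_page = addr / DEMO_PAGE_SIZE;
    uintptr_t last_page = (addr + width - 1) / DEMO_PAGE_SIZE;

    return first_page != last_page;
}

/*
 * Address of a 16-bit field occupying the last two bytes of the page
 * that starts at `page_base` (which must be page aligned).
 */
static uintptr_t last_u16_in_page(uintptr_t page_base)
{
    assert(page_base % DEMO_PAGE_SIZE == 0);
    return page_base + DEMO_PAGE_SIZE - 2;
}
```

With a made-up page base of 0xc1234000, load_crosses_page(last_u16_in_page(0xc1234000u), 2) is 0 and the same call with a width of 4 is 1. If the following page is not mapped, the wider load is the one that faults.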