Bug 680090

Summary: /usr/sbin/filefrag throws alignment errors on ARM
Product: [Fedora] Fedora Reporter: Gordan Bobic <gordan>
Component: e2fsprogsAssignee: Eric Sandeen <esandeen>
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 13CC: adilger.redhat, esandeen, josef, kzak, oliver
Target Milestone: ---   
Target Release: ---   
Hardware: arm9   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-27 12:09:29 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Gordan Bobic 2011-02-24 10:39:34 UTC
Description of problem:
/usr/sbin/filefrag is throwing alignment errors on the ARM platform.


Version-Release number of selected component (if applicable):
e2fsprogs-1.41.10-6.fc13.armv5tel

How reproducible:
Every time.

Steps to Reproduce:
1. /usr/sbin/filefrag /path/to/any/file
  
Actual results:
# dmesg | grep filefrag
Alignment trap: filefrag (25700) PC=0x00008994 Instr=0xe1c240f8 Address=0xbed9c39c FSR 0x801

Expected results:
No alignment warnings.

Additional info:
Recent ARMv7 chips includes automatic alignment fix-up in hardware, so testing for this has to be done on ARMv6 or earlier.

Here is is core backtrace:

Program terminated with signal 7, Bus error.
#0  filefrag_fiemap (fd=0, blk_shift=<value optimized out>, num_extents=0x0)
    at filefrag.c:196
196			fiemap->fm_length = ~0ULL;
(gdb) backtrace
#0  filefrag_fiemap (fd=0, blk_shift=<value optimized out>, num_extents=0x0)
    at filefrag.c:196
#1  0x0000922c in frag_report (argc=<value optimized out>, 
    argv=<value optimized out>) at filefrag.c:336
#2  main (argc=<value optimized out>, argv=<value optimized out>)
    at filefrag.c:412

Comment 1 Gordan Bobic 2011-02-24 13:16:24 UTC
Verified that this happens with the package build from the latest rawhide src.rpm as well:
coreutils-8.10-4.fc13.armv5tel

Backtrace:

Program terminated with signal 7, Bus error.
#0  filefrag_fiemap (fd=0, blk_shift=<value optimized out>, num_extents=0x0)
    at filefrag.c:196
196	filefrag.c: No such file or directory.
	in filefrag.c
(gdb) backtrace
#0  filefrag_fiemap (fd=0, blk_shift=<value optimized out>, num_extents=0x0)
    at filefrag.c:196
#1  0x0000922c in frag_report (argc=<value optimized out>, 
    argv=<value optimized out>) at filefrag.c:336
#2  main (argc=<value optimized out>, argv=<value optimized out>)
    at filefrag.c:412

Comment 2 Gordan Bobic 2011-02-24 13:22:49 UTC
Oops, please disregard Comment 1. Wrong package. *facepalm*

Comment 3 Gordan Bobic 2011-02-24 14:06:56 UTC
Right, let's try that again. The problem still persists with the package built form the latest e2fsprogs from rawhide:
e2fsprogs-1.41.14-2.fc13.armv5tel

Program terminated with signal 7, Bus error.
#0  filefrag_fiemap (fd=0, blk_shift=12, num_extents=0x0) at filefrag.c:196
196			fiemap->fm_length = ~0ULL;
(gdb) backtrace
#0  filefrag_fiemap (fd=0, blk_shift=12, num_extents=0x0) at filefrag.c:196
#1  0x0000923c in frag_report (argc=<value optimized out>, 
    argv=<value optimized out>) at filefrag.c:336
#2  main (argc=<value optimized out>, argv=<value optimized out>)
    at filefrag.c:412

Comment 4 Eric Sandeen 2011-02-24 14:47:19 UTC
filefrag.c:196 is 

        do {
                fiemap->fm_length = ~0ULL;   <----- HERE
                fiemap->fm_flags = flags;
                fiemap->fm_extent_count = count;

oh you said that already ;)

struct fiemap looks like:

struct fiemap {
        __u64 fm_start;         /* logical offset (inclusive) at
                                 * which to start mapping (in) */
        __u64 fm_length;        /* logical length of mapping which
                                 * userspace wants (in) */
        __u32 fm_flags;         /* FIEMAP_FLAG_* flags for request (in/out) */
        __u32 fm_mapped_extents;/* number of extents that were mapped (out) */
        __u32 fm_extent_count;  /* size of fm_extents array (in) */
        __u32 fm_reserved;
        struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
};

so fm_length is 32- and 64-bit aligned within the structure... 

and fiemap comes from:

        char buf[4096] = "";
        struct fiemap *fiemap = (struct fiemap *)buf;


help me out here, can you decode the alignment warning for me so I have some idea why it's throwing the error?  Does this mean the stack variable buf is unaligned to start with?

Comment 5 Gordan Bobic 2011-02-24 15:29:31 UTC
My best guess is that buf isn't aligned to begin with. I am not 100% sure how GCC will align an array of char without __attribute__ ((aligned(4))). It is plausible it will align it the same as it would a char, i.e. 1 byte.

So something like this:

< char buf[4096] = "";
> char buf[4096] __attribute__ ((aligned(4))) = "";

should fix the problem.

Of course, the problem could be somewhere else. I'll see if this fixes it.

Comment 6 Gordan Bobic 2011-02-24 15:51:24 UTC
Yup, that fixes it. char buf[] wasn't aligned.

I couldn't help but notice that char buf[] type constructs are used all over the place. All char[] arrays that are getting structs cast into them are going to have the same problem.

So the solution (short of somehow avoiding casting structs around) would be to make sure all char[] buffers are declared with __attribute__ ((aligned(4))).

More information on the general nature of the problem is here (it refers to XScale but applies to all ARMs except those based on Cortex A8/A9):
http://lecs.cs.ucla.edu/wiki/index.php/XScale_alignment

Comment 7 Eric Sandeen 2011-02-24 19:06:00 UTC
Yeah, I've had to fix up some arm alignment stuff in the past ... sprinkling __aligned__ attributes around is pretty ugly.  :(  I'm not sure what a nice way to fix this might be...

Comment 8 Gordan Bobic 2011-02-24 20:39:40 UTC
Not to mention that force-aligning everything to 4 bytes isn't any more portable, either. On a 64-bit architecture you'd still get a misalignment. It may not be obvious there is a problem (e.g. x86 and ARM Cortex A8 and later have transparent misalignment fixup in hardware), but that doesn't mean there isn't a problem.

To leverage SIMD, you'd typically need it aligned to 16-bytes, but that's a different story.

Ignoring the SIMD aspect, however, you could declare all the buffers in terms of int and scale them according to sizeof(int). If buffers are always in pow(2) sizes, you could work them out in one place and set up variables like size_4096, and just define bug[size_4096]. Since int is always word sized, it should ensure that the arrays are optimally aligned on all architectures. It's not ideal (it wouldn't be obvious at a glance what the size of the buffer is), but it'd solve the alignment problem in a reasonably portable way.

Unlike some other compilers (e.g. ICC), GCC doesn't seem to have a compiler option to align all arrays to a 16-byte boundary, which would be a simpler fix.

Comment 9 Bug Zapper 2011-05-30 11:14:50 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Bug Zapper 2011-06-27 12:09:29 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 11 Andreas Dilger 2015-08-27 21:40:10 UTC
Note that this bug was fixed in the upstream e2fsprogs via commit v1.42.13-3-g59707c1 on the maint branch (so it will be in 1.42.14 if it is ever released) and for the 1.43 release.