For the OLPC project, we're trying to reduce the kernel code size. One way of doing this is to build kernel objects with '-fwhole-program --combine'.

Unfortunately, while just '-fwhole-program' works if we use the ugly trick of a C file which #includes all the other C files, we have problems with '--combine'. It doesn't correctly notice that unions containing anonymous structures are identical, so it thinks that function prototypes involving those unions are 'conflicting' -- GCC PR #27898.

It also complains about global register variables which each file defines identically -- GCC PR #27899.

Adding to Red Hat bugzilla so that it can be added to the OLPC tracker bug.
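For readers unfamiliar with the pattern behind PR27898, this is roughly the shape of declaration involved: a union with an anonymous structure member, plus a function whose prototype mentions that union. It is a minimal sketch only. The names counter_example and counter_first are made up for illustration, and it is not confirmed that this exact fragment reproduces the bug; in the kernel the declarations live in a header that every file passed to --combine includes.

```c
/* Hypothetical union with an anonymous structure member (a GNU
 * extension, also valid C11). With --combine, each input file sees its
 * own copy of this declaration via the shared header. */
union counter_example {
    unsigned long long whole;
    struct {
        unsigned int first;
        unsigned int second;
    };
};

/* A function whose prototype involves the union. Per the report above,
 * declarations of this kind are what get flagged as conflicting when
 * the compiler fails to see the two copies of the union as identical. */
unsigned int counter_first(union counter_example c)
{
    return c.first;
}
```

Compiled on its own this is perfectly valid; the problem described above only shows up when several files containing the same declarations are compiled together.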
For reference, this is the kind of improvement in size we get, by using -fwhole-program and the ugly #include trick:

   text    data     bss     dec     hex filename
 117428    7832     320  125580   1ea8c jffs2-allinone.o
 122192    9056     328  131576   201f8 fs/jffs2/jffs2.o

We really can't do the #include trick for anything more than testing though -- we need to have --combine working before we can sensibly do this for real.
Alex provided a patch for PR27898:
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg00187.html

Is this in mainline yet? In our current packages?
Created attachment 134096
Test case on PowerPC

This is an example of the kind of thing we actually want to do -- build a whole subdirectory with --combine. Even with Alex's patch for PR27898 it fails in various ways...

include/asm/paca.h:23: warning: register used for two global register variables
In file included from include/linux/sched.h:51,
                 from fs/jffs2/build.c:16:
include/linux/rbtree.h:139: error: conflicting types for ‘rb_insert_color’
include/linux/rbtree.h:139: error: previous declaration of ‘rb_insert_color’ was here
include/linux/rbtree.h:140: error: conflicting types for ‘rb_erase’
include/linux/rbtree.h:140: error: previous declaration of ‘rb_erase’ was here
include/linux/rbtree.h:143: error: conflicting types for ‘rb_next’
include/linux/rbtree.h:143: error: previous declaration of ‘rb_next’ was here
include/linux/rbtree.h:144: error: conflicting types for ‘rb_prev’
include/linux/rbtree.h:144: error: previous declaration of ‘rb_prev’ was here
include/linux/rbtree.h:145: error: conflicting types for ‘rb_first’
include/linux/rbtree.h:145: error: previous declaration of ‘rb_first’ was here
include/linux/rbtree.h:146: error: conflicting types for ‘rb_last’
include/linux/rbtree.h:146: error: previous declaration of ‘rb_last’ was here
include/linux/rbtree.h:150: error: conflicting types for ‘rb_replace_node’
include/linux/rbtree.h:150: error: previous declaration of ‘rb_replace_node’ was here
include/linux/irq_cpustat.h:20: error: conflicting types for ‘irq_stat’
include/linux/irq_cpustat.h:20: error: previous declaration of ‘irq_stat’ was here
include/linux/fs.h:460: error: conflicting types for ‘mapping_tagged’
include/linux/fs.h:460: error: previous declaration of ‘mapping_tagged’ was here
include/linux/fs.h:993: error: conflicting types for ‘generic_osync_inode’
include/linux/fs.h:993: error: previous declaration of ‘generic_osync_inode’ was here
...
Created attachment 134097
Test case on i386

Similar test case on i386. Note interesting behaviour when just trying to combine read.c and write.c:

/opt/crosstool/gcc-4.1.0-glibc-2.3.6/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-gcc -m32 -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -fasynchronous-unwind-tables -g -msoft-float -mpreferred-stack-boundary=2 -mregparm=3 -ffreestanding -c -o readwrite.o read.i write.i --combine
In file included from include/linux/thread_info.h:22,
                 from include/linux/preempt.h:11,
                 from include/linux/spinlock.h:51,
                 from include/linux/wait.h:26,
                 from include/linux/fs.h:224,
                 from fs/jffs2/write.c:16:
include/asm/thread_info.h:96: warning: register used for two global register variables
fs/jffs2/write.c:712: internal compiler error: in splice_child_die, at dwarf2out.c:5492
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

That may be my self-built cross-compiler though.
Much of what I quoted in comment #3 seems to be explained by GCC PR28706.
If I remove the alignment on struct rb_node from read.i and write.i I can reproduce the failure mode of comment #4 on my PowerPC native build...

pmac /pmac/git/jffs2-play-2.6/jffs2-combine-ppc $ cc -m64 -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -c -o readwrite.o read.i write.i --combine -Daligned\(x\)=
In file included from include/asm/current.h:16,
                 from include/linux/wait.h:27,
                 from include/linux/fs.h:223,
                 from fs/jffs2/write.c:16:
include/asm/paca.h:23: warning: register used for two global register variables
fs/jffs2/write.c:673: internal compiler error: in splice_child_die, at dwarf2out.c:5492
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.

Removing '-g' from the compiler flags unsurprisingly fixes it. This is current (4.1.1-1.fc5) Fedora 5 GCC with Alex's patch for PR27898 applied.
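To spell out what the '-Daligned\(x\)=' on that command line does: it defines a function-like macro aligned(x) which expands to nothing, so an __attribute__((aligned(...))) on a structure collapses to an empty attribute list. A rough sketch follows. The structure node_example and the STRIP_ALIGNED switch are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <stddef.h>

/* Hypothetical switch: building with -DSTRIP_ALIGNED has the same
 * effect as passing -Daligned\(x\)= on the compiler command line. */
#ifdef STRIP_ALIGNED
#define aligned(x)
#endif

/* Stand-in for an explicitly aligned structure such as struct rb_node.
 * With the macro defined, the attribute below becomes __attribute__(()),
 * which GCC accepts as an empty attribute list. */
struct node_example {
    unsigned long parent_color;
    struct node_example *right;
    struct node_example *left;
} __attribute__((aligned(sizeof(long))));

/* Report the alignment the compiler actually chose for the structure. */
size_t node_example_alignment(void)
{
    return __alignof__(struct node_example);
}
```

On most targets the natural alignment of this structure is already sizeof(long), so dropping the attribute should not change the layout there. That is why it is a tolerable hack for a test build, though not something to rely on for a real kernel.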
Passing every .i file through s/aligned/ignore/g actually makes it work now on PPC. When combined with -fwhole-program I get object files which are 20% smaller than the original...

   text    data     bss     dec     hex filename
 120552    9000     328  129880   1fb58 jffs2-normal.o
 106688    7032     304  114024   1bd68 jffs2-combine.o
  97332    5824     304  103460   19424 jffs2-combine-whole-program.o

(#including all the files from one C file and using -fwhole-program was only getting me a 5% reduction, which seems slightly more reasonable -- and that's still about the same if I do the same s/aligned/ignore/ and build the same way. I'll have to test the '--combine -fwhole-program' version once PR28706 is fixed and see if it actually works.)

The i386 version still has (at least) one problem:

include/asm/processor.h:100: error: conflicting types for 'doublefault_tss'
include/asm/processor.h:100: error: previous declaration of 'doublefault_tss' was here
include/asm/processor.h:101: error: conflicting types for 'per_cpu__init_tss'
include/asm/processor.h:101: error: previous declaration of 'per_cpu__init_tss' was here
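The percentages quoted for the PPC build come from the 'dec' totals reported by size(1). A small sketch of the arithmetic, in case anyone wants to repeat it for other objects; the helper name reduction_percent is made up for illustration.

```c
/* Percentage by which an object shrank, given the 'dec' totals from
 * size(1) before and after. Converting to double before subtracting
 * avoids unsigned wrap-around if the object actually grew, in which
 * case the result is negative. */
double reduction_percent(unsigned long before, unsigned long after)
{
    return 100.0 * ((double)before - (double)after) / (double)before;
}
```

Calling reduction_percent(129880, 114024) gives the saving from --combine alone (roughly 12%), and reduction_percent(129880, 103460) gives the saving with -fwhole-program added (roughly 20%).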
> The i386 version still has (at least) one problem:
> include/asm/processor.h:100: error: conflicting types for 'doublefault_tss'
> include/asm/processor.h:100: error: previous declaration of 'doublefault_tss' was here
> include/asm/processor.h:101: error: conflicting types for 'per_cpu__init_tss'
> include/asm/processor.h:101: error: previous declaration of 'per_cpu__init_tss' was here

That's PR28712.
I've posted a patch for PR287{06,12}.
Perfect, thanks. With your patch and Alex's applied, I can now build modules with only the 'register used for two global register variables' warnings (PR27899), which I believe are harmless.

The standard build, for example, of fs/jffs2 on i386 (gcc 4.1.0) gives me this:

   text    data     bss     dec     hex filename
  80179     688     176   81043   13c93 fs/jffs2/built-in.o

With -fwhole-program --combine I get this:

   text    data     bss     dec     hex filename
  77555     688     188   78431   1325f fs/jffs2/built-in.o

Strangely, that latter result is the same whether I add -funit-at-a-time or -fno-unit-at-a-time to CFLAGS.
If you still have issues with rawhide gcc, please reopen. The global reg warning isn't solved there, but it can be ignored I guess.