Bug 485067 - rawhide kernel.ppc64 fails to boot on ppc
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: rawhide
Hardware: ppc64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F11Beta, F11BetaBlocker
Reported: 2009-02-11 13:56 UTC by James Laska
Modified: 2013-09-02 06:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-02-25 16:30:23 UTC
Type: ---
Embargoed:


Attachments
F-11-Alpha-ppc64 - boot.log (3.06 KB, text/plain)
2009-02-11 13:56 UTC, James Laska
objdump -d output of gcc 4.3 prom_init.o (151.67 KB, text/plain)
2009-02-17 18:44 UTC, Josh Boyer
objdump -d output of gcc 4.4 prom_init.o (152.89 KB, text/plain)
2009-02-17 18:44 UTC, Josh Boyer
ice output (776.00 KB, text/plain)
2009-02-17 20:51 UTC, Josh Boyer
ice output2 (953.78 KB, application/octet-stream)
2009-02-17 20:52 UTC, Josh Boyer
prom_init.i output from gcc 4.4 (753.52 KB, text/plain)
2009-02-18 16:09 UTC, Josh Boyer
Instrumented prom_init.c (65.51 KB, text/plain)
2009-02-18 19:24 UTC, Josh Boyer
objdump -d output of good prom_init.o (162.55 KB, text/plain)
2009-02-18 22:09 UTC, Josh Boyer
objdump -d output of failing prom_init.o (156.12 KB, text/plain)
2009-02-18 22:10 UTC, Josh Boyer


Links
System ID Private Priority Status Summary Last Updated
GNU Compiler Collection 39226 0 P2 RESOLVED [4.4/4.5 Regression] gcc_assert (verify_initial_elim_offsets ()); ICE 2021-01-28 17:54:36 UTC

Description James Laska 2009-02-11 13:56:22 UTC
Created attachment 331563 [details]
F-11-Alpha-ppc64 - boot.log

Description of problem:

Latest rawhide kernel.ppc64 fails to boot on power5 ppc system.  F-11-Alpha worked.

Version-Release number of selected component (if applicable):

kernel-2.6.29-0.99.rc4.ppc64

How reproducible:

Every time

Steps to Reproduce:
1. Install F-11-Alpha
2. yum update
3. reboot
  
Actual results:

|
Elapsed time since release of system processors: 50075 mins 22 secs

Config file read, 1024 bytes

Welcome to Fedora!
Hit <TAB> for boot options
Welcome to yaboot version 1.3.14 (Red Hat 1.3.14-9.fc11)
Enter "help" to get some basic usage information
boot: 2.6.29-0.99.rc4
Please wait, loading kernel...
   Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 03600000, size: 3515 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro console=hvc0 rhgb quiet root=/dev/SNAKEVG/SNAKEROOT 
memory layout at init:
  alloc_bottom : 000000000396f000
  alloc_top    : 0000000008000000
  alloc_top_hi : 00000000f5000000
  rmo_top      : 0000000008000000
  ram_top      : 00000000f5000000
Looking for displays
instantiating rtas at 0x00000000076a1000 ... done
boot cpu hw idx 0000000000000000
starting cpu hw idx 0000000000000002... done
starting cpu hw idx 0000000000000004... done
starting cpu hw idx 0000000000000006... done
copying OF device tree ...
Building dt strings...
Building dt structure...
DEFAULT CATCH!, exception-handler=DEFAULT CATCH!, exception-handler=fffffffffffffff6 
at   %SRR0: 0000000000c3bf3c   %SRR1: 800000000000b002 
Call History
------------
throw  - 93903c 
$call-method  - 946d5c 
(poplocals)  - 93a758 
key-fillq  - 94727c 
?xoff  - 947378 
(poplocals)  - 93a758 
(stdout-write)  - 9479a4 
(type)  - 947a30 
_syscatch  - 94df8c 
_exception  - 94d500 
<excp>  - 939890 
_syscatch  - 94def0 
_syscatch  - 94def0 
invalid pointer - 3d6010013ae96970 
invalid pointer - 3aab52803d201001 
invalid pointer - 3d6010013c00cccc 

Client's Fix Pt Regs:
 00 00100000000001f4 0000000060000000 00000000deadbeef fffffffffffffffc
 04 0000000000000000 0000000000000000 0000000000000000 0000000000000001
 08 0000000000001000 0000030002001000 0000000000000003 0000000000007000
 0c 0000000022000044 0000000000000000 000000000021a354 000000000021a3a4
 10 0000000000e3dd70 0000000000e3dd70 0000000000c4e728 0000000000c546e8
 14 0000000000000000 0000000000c81948 0000000000000028 00000000024d15f8
 18 0000000000c13000 0000000000c38000 0000000000c15000 0000000000c16fc0
 1c 0000000000c20000 0000000000c3fdf0 0000000000c11fd8 0000000000c10fe8
Special Regs:
    %IV: 00000300     %CR: 8a000044    %XER: 20000000  %DSISR: 08000000 
  %SRR0: 0000000000c3bf3c   %SRR1: 800000000000b002 
    %LR: 0000000000c3bed0    %CTR: 0000000000000000 
   %DAR: 0000000060000000 
Virtual PID = 2 
PFW: Unable to send error log!
 ofdbg
0 > DEFAULT CATCH!, exception-handler=fff00700 
:EFAULT CATCH!, at exception-handler=  fff00300 %SRR0
  at   %SRR0:DEFAULT CATCH!, r=000000000000000GDEFAULT CATCH!,    %SRR1: DEFAULT CATCH!, 0000000000000002 
Call History
------------
evaluate  -  
invalid pointer -  
invalid pointer - a 
invalid pointer - 0 
eval  - S 
catch  - ? 
display-checkpoint  - 
 (poplocals)  - ¯ 
(checkpoint)  - ë 
?xoff  - 7 
(poplocals)  - ¯ 
(stdout-write)  -  
(emit)  - ó 
(cr  -  
cr  - 3 
_syscatch  - ï 

My Fix Pt Regs:
 00 0000000000c479f0 0000000000000000 00000000deadbeef 0000000000e3dd00
 04 0000000000000041 000000000000001a ffffffffffffffbf 0000000000c03010
 08 0000000008000000 80000000001f9ca0 80000000001f9ca0 80000000001af548
 0c 0000000000004000 0000000000000000 80000000001af46c 0000000000c00060
 10 80000000747bf404 0000000000e3dd70 0000000000c4a630 0000000000c685a9
 14 0000000000c174ff 0000000000000001 0000000000000000 0000000000000000
 18 0000000000c13000 0000000000c38000 0000000000c14ec0 0000000000c16f40
 1c 0000000000c20000 0000000000c3fdf0 0000000000c11f40 0000000000c10ff8
Special Regs:
    %IV: 00000700     %CR: 42000000    %XER: 00000002  %DSISR: 00000000 
  %SRR0: 0000000000c479f0   %SRR1: 8000000000023002 
    %LR: 0000000000c4a638    %CTR: 0000000000c479f0 
   %DAR: 0000000000000000 
Virtual PID = 6 
PFW: Unable to send error log!
þí`0(m"¯t(`6Ð jµð, unknown word


Expected results:

 * kernel should boot

Additional info:

 * See attached successful boot log from F-11-Alpha-ppc64

Comment 1 Tony Breeds 2009-02-12 04:14:42 UTC
Set Architecture to powerpc as that matches the bug description.

Comment 2 Jarod Wilson 2009-02-12 23:01:49 UTC
Set arch to ppc6, since that matches the bug description better. :)

My YDL PowerStation is failing similarly, and jwb's G5 seems to be dying around the same spot too.

Comment 3 Jarod Wilson 2009-02-12 23:03:29 UTC
fail... ppc64, I meant...

The PowerStation isn't going into the exception handler, it's just hanging, but it's at more or less the exact same spot.

Comment 4 Jesse Brandeburg 2009-02-12 23:42:31 UTC
might be related to bug #485267 ?

Comment 5 Jarod Wilson 2009-02-13 13:42:12 UTC
(In reply to comment #4)
> might be related to bug #485267 ?

Don't think so, we're barely even getting from yaboot to the kernel here, and at least the PowerStation runs F10 just fine, including dual-head X and everything.

Comment 6 Josh Boyer 2009-02-16 15:16:04 UTC
I've noticed odd behavior since the Alpha kernels.  Oopsing on ssh, etc.  So I've started the equivalent of 'koji bisect'.  So far, the good kernels are:

kernel-2.6.28-3.fc11.ppc64
kernel-2.6.29-0.18.rc0.git9.fc11.ppc64
kernel-2.6.29-0.40.rc1.git6.fc11.ppc64

Starting with kernel-2.6.29-0.53.rc2.git1.fc11.ppc64 I get oopses from sig 4s.

Comment 7 Josh Boyer 2009-02-16 15:35:02 UTC
kernel-2.6.29-0.48.rc2.git1.fc11 works without oopses.  It's the last successful build before 0.53.  The changelog for 0.53 is:

* Mon Jan 26 2009 Kyle McMartin <kyle>
- Update git-linus.diff to bf50c903faba4ec7686ee8a570ac384b0f20814d.
- drm-next.patch merged.
- linux-2.6.28-sunrpc-ipv6-rpcbind.patch: update for Kconfig moves.

* Sat Jan 24 2009 Hans de Goede <hdegoede>
- Fix atk0110 sensor numbering

* Fri Jan 23 2009 Hans de Goede <hdegoede>
- Change acpi_enforce_resources default to strict, this will cause hwmon
  drivers which clash with io resources reserved by ACPI to no longer load,
  avoiding both the ACPI code and the native driver trying to drive the same
  IC at the same time
- Add ASUS ACPI hwmon interface driver (atk0110), this will give (restore)
  hwmon functionality on most ASUS boards through the firmware


Since acpi and atk0110 don't apply to this class of machine, perhaps either the drm-next.patch or git-linus.diff is the "bad" change.

Comment 8 Josh Boyer 2009-02-16 17:59:19 UTC
For grins, I tried a vanilla -rc5 build.  This hangs after opening the console device, so whatever change causes this problem is in the upstream kernel.  Seems somewhere between rc2 and rc5.  Joy.

Comment 9 Josh Boyer 2009-02-17 12:49:32 UTC
I spent most of yesterday doing a git bisect between -rc2 and -rc5.  None of the kernels worked.  They all hung after opening the display device (or more likely failed after but the output wasn't caught on the screen).

I spoke to Ben Herrenschmidt a bit last night.  He tried a g5_defconfig build and a build using the fedora .config file.  Both of his kernels worked on his dual G5 which is mostly identical to mine.  We have fairly sizeable toolchain differences though, since he was building on some Ubuntu box and I was building on rawhide.

For grins this morning, I did a 'make local' of a devel kernel on my F9 box.  Copied this over and rebooted and it works just fine.

[jwboyer@localhost ~]$ uname -a
Linux localhost.localdomain 2.6.29-0.119.rc5.fc11.ppc64 #1 SMP Tue Feb 17 07:00:20 EST 2009 ppc64 ppc64 ppc64 GNU/Linux


I'm beginning to suspect that it might be the toolchain in rawhide.

Comment 10 Josh Boyer 2009-02-17 15:04:20 UTC
[jwboyer@localhost boot]$ file vmlinuz-2.6.29-0.124.rc5.fc11.ppc64 
vmlinuz-2.6.29-0.124.rc5.fc11.ppc64: ELF 64-bit MSB shared object, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, stripped
[jwboyer@localhost boot]$ strings vmlinuz-2.6.29-0.124.rc5.fc11.ppc64 | grep gcc
Linux version 2.6.29-0.124.rc5.fc11.ppc64 (jwboyer) (gcc version 4.4.0 20090213 (Red Hat 4.4.0-0.18) (GCC) ) #1 SMP Tue Feb 17 08:13:43 EST 2009

The above kernel fails to boot.  Installing identical kernel built on F9:

[jwboyer@localhost ~]$ sudo yum remove kernel-2.6.29-0.124.rc5.fc11.ppc64
Loaded plugins: refresh-packagekit
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package kernel.ppc64 0:2.6.29-0.124.rc5.fc11 set to be erased
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package       Arch         Version                      Repository        Size
================================================================================
Removing:
 kernel        ppc64        2.6.29-0.124.rc5.fc11        installed         95 M

Transaction Summary
================================================================================
Install      0 Package(s)         
Update       0 Package(s)         
Remove       1 Package(s)         

Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Erasing        : kernel                                                   1/1 

Removed:
  kernel.ppc64 0:2.6.29-0.124.rc5.fc11                                          

Complete!
[jwboyer@localhost ~]$ sudo yum localinstall --nogpgcheck ./kernel-2.6.29-0.124.rc5.fc11.ppc64.rpm 
Loaded plugins: refresh-packagekit
Setting up Local Package Process
Examining ./kernel-2.6.29-0.124.rc5.fc11.ppc64.rpm: kernel-2.6.29-0.124.rc5.fc11.ppc64
Marking ./kernel-2.6.29-0.124.rc5.fc11.ppc64.rpm as an update to kernel-2.6.29-0.74.rc3.git3.fc11.ppc64
Marking ./kernel-2.6.29-0.124.rc5.fc11.ppc64.rpm as an update to kernel-2.6.29-0.119.rc5.fc11.ppc64
Resolving Dependencies
--> Running transaction check
---> Package kernel.ppc64 0:2.6.29-0.124.rc5.fc11 set to be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package
   Arch  Version                 Repository                                Size
================================================================================
Installing:
 kernel
   ppc64 2.6.29-0.124.rc5.fc11   ./kernel-2.6.29-0.124.rc5.fc11.ppc64.rpm  91 M

Transaction Summary
================================================================================
Install      1 Package(s)         
Update       0 Package(s)         
Remove       0 Package(s)         

Total download size: 91 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : kernel                                                   1/1 

Installed:
  kernel.ppc64 0:2.6.29-0.124.rc5.fc11                                          

Complete!
[jwboyer@localhost ~]$ cd /boot
[jwboyer@localhost boot]$ file vmlinuz-2.6.29-0.124.rc5.fc11.ppc64 
vmlinuz-2.6.29-0.124.rc5.fc11.ppc64: ELF 64-bit MSB shared object, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, stripped
[jwboyer@localhost boot]$ strings vmlinuz-2.6.29-0.124.rc5.fc11.ppc64 | grep gcc
Linux version 2.6.29-0.124.rc5.fc11.ppc64 (jwboyer.homelinux.org) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Tue Feb 17 08:05:39 EST 2009

That kernel boots.  It does get an oops in some of the compat stuff, but it at least boots.

Comment 11 Josh Boyer 2009-02-17 15:04:49 UTC
Jakub, any ideas on this one?

Comment 12 Josh Boyer 2009-02-17 15:19:02 UTC
I have one more experiment I'd like to try.  Rawhide has CONFIG_RELOCATABLE=y set for ppc64 kernels.  Going to try to disable that, rebuild on both F9 and rawhide, and see what happens.

Comment 13 Jakub Jelinek 2009-02-17 15:48:56 UTC
Try the usual approach unless it is obvious where the problem is: if it works when compiled with gcc 4.3.2 and doesn't with 4.4.0, do a binary search among the .o files built by both compilers to narrow down which .c file matters, then try to narrow it to a function and build a self-contained testcase from it.  If e.g. building the kernel with 4.4.0 but with -O1 instead of -O2 (or some other option combination) works, you can do the binary search between -O1 and -O2 built objects instead.
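
The object-file binary search suggested in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `bisect_objects`, `boots_fn` and `simulated_boots` are hypothetical names, the real "does it boot" check is a manual relink and reboot rather than a function call, and the sketch assumes exactly one object file is responsible for the failure.

```c
#include <stddef.h>

/* Hypothetical predicate: nonzero if a kernel linked with the first
 * `count` objects taken from the suspect (gcc 4.4) tree and the rest
 * from the known-good (gcc 4.3) tree still boots. */
typedef int (*boots_fn)(size_t count, void *ctx);

/* Returns the index of the object whose suspect build breaks the boot.
 * Assumes n >= 1, boots(0) is true, boots(n) is false, and there is
 * exactly one culprit. */
size_t bisect_objects(size_t n, boots_fn boots, void *ctx)
{
    size_t lo = 0;  /* largest count known to boot */
    size_t hi = n;  /* smallest count known to fail */

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (boots(mid, ctx))
            lo = mid;
        else
            hi = mid;
    }
    return lo;      /* the object at index lo is the culprit */
}

/* Stand-in for the manual test, for demonstration only: ctx points at
 * the culprit's index, and the kernel "boots" as long as the culprit's
 * suspect build has not been included yet. */
int simulated_boots(size_t count, void *ctx)
{
    return count <= *(const size_t *)ctx;
}
```

With n object files this takes about log2(n) relink-and-reboot cycles, which is why guessing a likely candidate first (as done with prom_init.o in comment 15) can be quicker when the failure location is already roughly known.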

Comment 14 Josh Boyer 2009-02-17 17:18:34 UTC
(In reply to comment #12)
> I have one more experiment I'd like to try.  Rawhide has CONFIG_RELOCATABLE=y
> set for ppc64 kernels.  Going to try to disable that, rebuild on both F9 and
> rawhide, and see what happens.

CONFIG_RELOCATABLE not set still doesn't help rawhide/gcc 4.4 builds.  It seems to make the oops on ssh I was seeing with gcc 4.3.2 go away though.  Interesting, but not really that much help.

Comment 15 Josh Boyer 2009-02-17 17:46:23 UTC
(In reply to comment #13)
> Try the usual stuff, unless it is obvious where the problem is, if it works
> compiled with gcc 4.3.2 and doesn't with 4.4.0, do a binary search among .o
> files built by both compilers to narrow down which .c file matters, then try to
> narrow it to a function, build a self-contained testcase from it.  If e.g.
> building the kernel with 4.4.0 but say -O1 instead of -O2 (or some other option
> combination) works, you can do the binary search between -O1 and -O2 built
> objects etc.

OK.  So I took "binary search among .o files" to mean:

"copy the .o files from gcc 4.4 into a gcc 4.3 build tree.  recompile vmlinux.  test"

Hopefully I got that part right.  If not, please yell now because that's what I did.

Going on the assumption above, and given the proximity of the failure/hang, I copied arch/powerpc/kernel/prom_init.o from the gcc 4.4 build tree to the gcc 4.3 build tree and redid 'make vmlinux'.  Copied the resulting vmlinux onto the machine and rebooted, and it hangs just like a full kernel built with 4.4.

Comment 16 Josh Boyer 2009-02-17 18:44:17 UTC
Created attachment 332273 [details]
objdump -d output of gcc 4.3 prom_init.o

Comment 17 Josh Boyer 2009-02-17 18:44:56 UTC
Created attachment 332274 [details]
objdump -d output of gcc 4.4 prom_init.o

Comment 18 Josh Boyer 2009-02-17 18:47:11 UTC
Attached the objdump output of the differing .o files above.  I haven't had time to look at them in detail, but in my 5 minute glance at them I noticed a distinct lack of mtctr instructions in functions that use va_start/va_end/var args stuff in the gcc4.4 file.  See call_prom_ret for example.

Comment 19 Peter Bergner 2009-02-17 19:00:34 UTC
The fewer mtctr instructions are due to the compiler no longer using bdnz (i.e., branch on count register) loops with this particular gcc 4.4 revision.

Comment 20 Josh Boyer 2009-02-17 20:49:08 UTC
I tried rebuilding the kernel with -O2 instead of -Os using gcc 4.4.  The following was emitted during the build:

drivers/md/bitmap.c: In function ‘bitmap_count_page’:
drivers/md/bitmap.c:1070: internal compiler error: in reload, at reload1.c:1173
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/cc5gfyny.out file, please attach this to your bugreport.

Comment 21 Josh Boyer 2009-02-17 20:49:51 UTC
Oh, and:

net/ipv6/addrconf_core.c: In function ‘__ipv6_addr_type’:
net/ipv6/addrconf_core.c:77: internal compiler error: in reload, at reload1.c:1173
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/cchwwh9u.out file, please attach this to your bugreport.


I'll attach those files shortly

Comment 22 Josh Boyer 2009-02-17 20:51:41 UTC
Created attachment 332296 [details]
ice output

Comment 23 Josh Boyer 2009-02-17 20:52:07 UTC
Created attachment 332297 [details]
ice output2

Comment 24 Josh Boyer 2009-02-17 21:24:52 UTC
This is all with gcc-4.4.0-0.18.ppc.  I noticed there is an update in rawhide today to -19, but when looking at the changelog I can't see how any of the mentioned PRs would apply here.

Comment 25 Jakub Jelinek 2009-02-17 23:10:04 UTC
Both ICEs are likely the same bug, reduced testcase:
/* { dg-do compile } */
/* { dg-options "-O2" } */
/* { dg-options "-O2 -mtune=cell -mminimal-toc" { target { powerpc*-*-* && lp64 } } } */

struct A
{
  char *a;
  unsigned int b : 1;
  unsigned int c : 31;
};

struct B
{
  struct A *d;
};

void
foo (struct B *x, unsigned long y)
{
  if (x->d[y].c)
    return;
  if (x->d[y].b)
    x->d[y].a = 0;
}

Will look into this tomorrow.
That said, kernel not booting with -Os is unrelated to this, so if you could make progress on finding in which function things went wrong in prom_init.o and ideally with what arguments it has been called (i.e. try to make a self-contained testcase from it), it would be greatly appreciated.

Comment 26 Jakub Jelinek 2009-02-18 12:44:22 UTC
The ICEs are now tracked as http://gcc.gnu.org/PR39226 upstream.  In light of that, can you try prom_init.c built with 4.4 and -Os, but without -mtune=cell (say -mtune=power4)?

Also, do you have a rough idea into which function in prom_init.c to look?  And, can you attach preprocessed prom_init.i and the list of gcc options used to compile it?

Thanks.

Comment 27 Josh Boyer 2009-02-18 12:57:39 UTC
(In reply to comment #26)
> The ICEs are now tracked as http://gcc.gnu.org/PR39226 upstream.  In the light
> of tho, can you try prom_init.c built with 4.4 and -Os, but without -mtune=cell
> (say -mtune=power4)?

Yes.  Actually, because of some config options set for the kernel, both -mtune=power4 and -mtune=cell are getting passed.  I believe the latter one "wins".

I'll unset the option that causes it to be tuned for cell today.

> Also, do you have a rough idea into which function in prom_init.c to look? 

Unfortunately, not yet.  The sucky part about prom_init.c is that all of it runs before the kernel has relocated itself to the normal addresses.  It is still doing calls into OF for various things in this file before doing that relocation.

I made a bit more progress on finding a failing function by commenting out the prom_check_displays function, which was causing the screen to be blanked.  After I did that, I see that I get an exception from OF during what I believe is the scan_dt_build_struct function, which is called from flatten_device_tree.

More info as I find it.

> And, can you attach preprocessed prom_init.i and the list of gcc options used
> to compile it?

I'll get the .i file as soon as I can.  The compile options for both 4.3 and 4.4 are here:

http://fpaste.org/paste/3900

Comment 28 Josh Boyer 2009-02-18 16:09:18 UTC
Created attachment 332404 [details]
prom_init.i output from gcc 4.4

prom_init.i output, generated with:

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.prom_init.o.d  -nostdinc -isystem /usr/lib/gcc/ppc64-redhat-linux/4.4.0/include -Iinclude  -I/home/jwboyer/src/kernel/devel/kernel-2.6.28/linux-2.6.28.ppc64/arch/powerpc/include -include include/linux/autoconf.h -D__KERNEL__ -Iarch/powerpc -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -Iarch/powerpc -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mtune=cell -mno-altivec -mno-spe -mspe=no -funit-at-a-time -mno-string -Wa,-maltivec -Wframe-larger-than=2048 -fno-stack-protector -fomit-frame-pointer -g -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(prom_init)"  -D"KBUILD_MODNAME=KBUILD_STR(prom_init)"  -E -o arch/powerpc/kernel/prom_init.i arch/powerpc/kernel/prom_init.c

Comment 29 Jakub Jelinek 2009-02-18 17:12:13 UTC
Thanks.  Could you also try building prom_init.c with -O0 and see if that doesn't help?  If -O0 works, GCC 4.4 has __attribute__((__optimize__(N))) for N 0-3, which you perhaps could use to narrow it down to a particular function (compile everything with -Os, and for a bunch of functions add __attribute__((__optimize__(0))) to compile them at -O0).
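
A minimal sketch of the per-function attribute mentioned in this comment, for readers who have not used it. The function names and the `FORCE_O0` macro are made up for illustration and are not taken from prom_init.c; the macro guard is an assumption added so the file also builds with compilers that lack the attribute.

```c
/* GCC accepts the per-function optimize attribute from 4.4 on; other
 * compilers get an empty macro so the file still builds. */
#if defined(__GNUC__) && !defined(__clang__)
#define FORCE_O0 __attribute__((__optimize__(0)))
#else
#define FORCE_O0
#endif

/* Compiled at whatever level the command line asks for, e.g. -Os. */
int sum_default(const unsigned char *p, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Same body, but this one function is compiled as if with -O0, even
 * when the rest of the file is built with -Os. */
FORCE_O0 int sum_at_O0(const unsigned char *p, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

Both functions must return the same result; only the generated code differs. Tagging half of the suspect functions this way, rebuilding, and retesting narrows a miscompilation down to a single function in a few rounds.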

Comment 30 Josh Boyer 2009-02-18 18:06:24 UTC
(In reply to comment #29)
> Thanks.  Could you also try building prom_init.c with -O0 and see if that
> doesn't help?  

It does actually.  Both -O0 and -O1 builds of prom_init.c allow the system to boot (when the .o is copied to the 4.3 kernel and the vmlinux is rebuilt with it.)

-O2 and -Os fail similarly.

> If -O0 works, GCC 4.4 has __attribute__((__optimize__(N))) for N
> 0-3, which you perhaps could use to narrow it down to a particular function
> (compile everything with -Os, and for a bunch of functions add
> __attribute__((__optimize__(0))) to compile them at -O0.

I'll try this now.

Comment 31 Josh Boyer 2009-02-18 18:09:26 UTC
Oh, I did try removing -mtune=cell; however, that does not appear to make a difference.

Comment 32 Josh Boyer 2009-02-18 19:24:34 UTC
Created attachment 332437 [details]
Instrumented prom_init.c

It seems prom_claim is the function that needs the -O0 optimization.  Building this C file with:

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.prom_init.o.d  -nostdinc -isystem /usr/lib/gcc/ppc64-redhat-linux/4.4.0/include -Iinclude  -I/home/jwboyer/src/kernel/devel/kernel-2.6.28/linux-2.6.28.ppc64/arch/powerpc/include -include include/linux/autoconf.h -D__KERNEL__ -Iarch/powerpc -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -Iarch/powerpc -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mtune=cell -mno-altivec -mno-spe -mspe=no -funit-at-a-time -mno-string -Wa,-maltivec -Wframe-larger-than=2048 -fno-stack-protector -fomit-frame-pointer -g -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(prom_init)"  -D"KBUILD_MODNAME=KBUILD_STR(prom_init)"  -c -o arch/powerpc/kernel/prom_init.o arch/powerpc/kernel/prom_init.c

copying the resulting .o to the gcc 4.3 tree, and booting the resulting vmlinux works.

Comment 33 Josh Boyer 2009-02-18 19:57:17 UTC
Looking at that function, I noticed that the bulk of it should be optimized out given that:

if (align == 0 && (OF_WORKAROUNDS & OF_WA_CLAIM))

will always evaluate to false since OF_WORKAROUNDS is #defined to 0 on CONFIG_PPC64.  So the function logically boils down to this:

static unsigned int __init prom_claim(unsigned long virt, unsigned long size,
                                      unsigned long align)
{

    struct prom_t *_prom = &RELOC(prom);

    return call_prom("claim", 3, 1, (prom_arg_t)virt, (prom_arg_t)size,
                     (prom_arg_t)align);
}

Doing an objdump -d of the bad prom_init.o, I can't see anywhere that call_prom is actually called, and call_prom is not an inlined function in this case.

The call_prom call is pretty important here.  prom_claim is called from alloc_up and alloc_down, which are used to claim memory from OF (roughly speaking).  These findings are also consistent with the small amount of crash data I was seeing, as the offending offset was near a make_room call, which calls alloc_up (which is supposed to call prom_claim->call_prom).
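
Why the if-body drops out can be seen in isolation with a small sketch. `takes_workaround_path` is a hypothetical function, and the value given to `OF_WA_CLAIM` here is an assumption for illustration; only `OF_WORKAROUNDS` being 0 on CONFIG_PPC64 comes from the comment above.

```c
#define OF_WORKAROUNDS 0   /* as on CONFIG_PPC64, per the comment above */
#define OF_WA_CLAIM    1   /* assumed bit value, for illustration only */

/* Mirrors the shape of the test in prom_claim.  Because
 * (OF_WORKAROUNDS & OF_WA_CLAIM) is the constant 0, the condition is
 * false for every value of align and the compiler may drop the branch. */
int takes_workaround_path(unsigned long align)
{
    if (align == 0 && (OF_WORKAROUNDS & OF_WA_CLAIM))
        return 1;
    return 0;
}
```

Whatever `align` is, the function returns 0, which matches the reduction of prom_claim to a single call_prom call.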

Comment 34 Jakub Jelinek 2009-02-18 20:24:35 UTC
With -Os and no optimize attribute prom_claim is versioned (so that it is only called with 2 arguments instead of 3, align is not used) and in the tree optimized dump I still see it (and prom_printf ... trying: 0x... calls before that).  Nothing in the assembly references those, so it is optimized out during RTL optimizations.

Comment 35 Jakub Jelinek 2009-02-18 20:32:52 UTC
Actually no, it is just section anchoring, as all the strings are in .rodata
and so they are referenced through .LANCHOR1 + offset.
In my .s file "claim" string is at .LANCHOR1 + 898, and the T.657 function
calls call_prom with that string as the first argument.

Comment 36 Jakub Jelinek 2009-02-18 20:39:25 UTC
BTW, couldn't the
struct prom_t *_prom = &RELOC(prom);
line be moved into if (align == 0 && ...) body?  Seems it isn't needed outside of that if's body and as it contains a function call, it can't be optimized out.

That's just a random optimization idea, I really still have no idea where to look for a problem.

Comment 37 Josh Boyer 2009-02-18 22:08:48 UTC
(In reply to comment #35)
> Actually no, it is just section anchoring, as all the strings are in .rodata
> and so they are referenced through .LANCHOR1 + offset.
> In my .s file "claim" string is at .LANCHOR1 + 898, and the T.657 function
> calls call_prom with that string as the first argument.

Yes, I think you're right.  I see similar things in my objdump.  I'll attach the good and bad objdumps I have.

Comment 38 Josh Boyer 2009-02-18 22:09:38 UTC
Created attachment 332463 [details]
objdump -d output of good prom_init.o

Comment 39 Josh Boyer 2009-02-18 22:10:28 UTC
Created attachment 332465 [details]
objdump -d output of failing prom_init.o

Comment 40 Josh Boyer 2009-02-18 23:51:45 UTC
Just so that everyone sees what I'm seeing, here are some pictures of the early prom stuff from a good and bad kernel.  (This isn't part of dmesg, hence the jpgs).

Good:

http://jwboyer.fedorapeople.org/ppc64-good.jpg

Bad:

http://jwboyer.fedorapeople.org/ppc64-bad.jpg

You can definitely see oddness in the alloc_up call that is done.  Bad returns ffffffffffffffff (-1?), while the good succeeds.

Comment 41 Jakub Jelinek 2009-02-19 10:14:39 UTC
Thanks, that was enough to find out what's wrong.  Surprisingly, this doesn't appear to be a regression, but a long standing ppc -m64 sibcall optimization bug.

extern void abort (void);

__attribute__ ((noinline))
static int foo (int x)
{
  return x;
}

__attribute__ ((noinline))
unsigned int bar (int x)
{
  return foo (x + 6);
}

unsigned long l = (unsigned int) -4;

int
main (void)
{
  if (bar (-10) != l)
    abort ();
  return 0;
}

works when compiled with -m32 (any optimization level) or -m64 -O{0,1},
or -m64 -O{2,3,s} -fno-optimize-sibling-calls, but aborts for -m64 -O{2,3,s},
with all of 4.1.x, 4.3.x and trunk GCCs.  PPC64 psABI says:
"Functions shall return values of type int, long, enum, short, and char, or a pointer to any type, as unsigned or signed integers as appropriate, zero- or sign-extended to 64 bits if necessary, in r3."
which really means that it is not valid to do a sibcall in between a function that returns < 64-bit signed integral and a function that returns < 64-bit unsigned integral, as the value must be sign-extended to 64-bits in the first case and zero-extended in the second case.
Surprises me this wasn't discovered years ago.
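
The mismatch described above can be illustrated without the compiler bug itself. The sketch below only models the 64-bit register contents with ordinary integer conversions; the function names are hypothetical, and it does not reproduce the sibcall miscompilation.

```c
#include <stdint.h>

/* What a callee declared to return `int` leaves in a 64-bit register
 * under the quoted psABI rule: the value sign-extended to 64 bits. */
uint64_t reg_after_int_return(int v)
{
    return (uint64_t)(int64_t)v;
}

/* What a caller of a function declared to return `unsigned int`
 * expects to find there: the same 32 bits zero-extended. */
uint64_t reg_after_uint_return(int v)
{
    return (uint64_t)(uint32_t)v;
}
```

For -4 the first form gives 0xfffffffffffffffc and the second gives 0x00000000fffffffc. In the testcase, bar is declared to return unsigned int but, after the sibcall, hands back foo's sign-extended int result unchanged, so the comparison against l (0xfffffffc) fails. For non-negative values the two forms coincide, which may be part of why the bug went unnoticed for so long.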

Comment 42 Josh Boyer 2009-02-19 11:55:13 UTC
(In reply to comment #41)
> Thanks, that was enough to find out what's wrong.  Surprisingly, this doesn't
> appear to be a regression, but a long standing ppc -m64 sibcall optimization
> bug.

Awesome.  Thanks Jakub.

Doing some bugzilla housekeeping to mark this against gcc and put it as an F11-Beta blocker.  As soon as we get a fixed gcc, I'll be happy to test.

Comment 43 Josh Boyer 2009-02-25 16:15:31 UTC
I've built local kernels and used the kernels from koji built with gcc-4.4.0-21 and they all work now (aside from the unrelated module loading bug).

I believe this bug can be closed out.  Thanks again Jakub

Comment 44 Jarod Wilson 2009-02-25 16:30:23 UTC
Yup, recent kernels boot on the powerstation too, closing bug.

Comment 45 Jakub Jelinek 2009-03-04 20:17:25 UTC
The -O2 -mtune=cell ICEs discussed in #c20 through #c26 should be now fixed in gcc-4.4.0-0.22.

