SACHIN P. SANT <sachinp.com>-
After the 1st stage of TEXT MODE INSTALLATION completes, the system (p55, POWER5) segfaults on the 1st reboot. Just before the segfault, the following badness message is displayed on the console:

returning from prom_init
Phyp-dump not supported on this hardware
ibmvscsi 3000000b: fast_fail not supported in server
------------[ cut here ]------------
Badness at lib/dma-debug.c:820
sd 0:0:1:0: [sda] Assuming drive cache: write through
sd 0:0:1:0: [sda] Assuming drive cache: write through
sd 0:0:1:0: [sda] Assuming drive cache: write through

The same badness message can be reproduced with the latest upstream kernels. Here is the complete back trace (against a vanilla kernel):

ibmvscsi 30000003: Client reserve enabled
ibmvscsi 30000003: sent SRP login
ibmvscsi 30000003: SRP_LOGIN succeeded
ibmvscsi 30000003: DMA-API: device driver frees DMA memory with wrong function [device address=0x0000000000011520] [size=36 bytes] [mapped as scather-gather] [unmapped as single]
------------[ cut here ]------------
Badness at lib/dma-debug.c:820
NIP: c00000000039bd24 LR: c00000000039bd20 CTR: c0000000000704a4
REGS: c00000000f69f6f0 TRAP: 0700 Tainted: G W (2.6.34-rc3)
MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 48000082 XER: 00000004
TASK = c00000000125cc70[0] 'swapper' THREAD: c000000001324000 CPU: 0
GPR00: c00000000039bd20 c00000000f69f970 c000000001322e38 00000000000000b6
GPR04: 0000000000000001 c0000000000c1ea8 0000000000000000 0000000000000002
GPR08: 0000000000000000 c00000000125cc70 0000000000000f66 0000000000000001
GPR12: 0000000000000002 c00000000f669000 0000000000d47940 0000000001c00000
GPR16: ffffffffffffffff 0000000002673148 00000000018ff984 0000000000000006
GPR20: 0000000000000000 c000000000c7bb80 0000000000000000 0000000000000000
GPR24: 0000000000000001 c000000001de6b00 0000000000000001 c000000001de6f80
GPR28: c000000109c696b0 c00000000f69fa90 c0000000012acab0 c00000000f69f970
NIP [c00000000039bd24] .check_unmap+0x3e0/0x784
LR [c00000000039bd20] .check_unmap+0x3dc/0x784
Call Trace:
[c00000000f69f970] [c00000000039bd20] .check_unmap+0x3dc/0x784 (unreliable)
[c00000000f69fa20] [c00000000039c3dc] .debug_dma_unmap_page+0x98/0xc8
[c00000000f69fb60] [d000000001dd63f4] .unmap_cmd_data+0xd0/0x11c [ibmvscsic]
[c00000000f69fc00] [d000000001dd8878] .handle_cmd_rsp+0xe0/0x154 [ibmvscsic]
[c00000000f69fca0] [d000000001dd7694] .ibmvscsi_handle_crq+0x44c/0x500 [ibmvscsic]
[c00000000f69fd40] [d000000001ddaca4] .rpavscsi_task+0x50/0xd8 [ibmvscsic]
[c00000000f69fdf0] [c0000000000c9e84] .tasklet_action+0x108/0x1d4
[c00000000f69fea0] [c0000000000cb778] .__do_softirq+0x168/0x2b8
[c00000000f69ff90] [c0000000000337b0] .call_do_softirq+0x14/0x24
[c000000001327840] [c000000000010664] .do_softirq+0xa0/0x104
[c0000000013278e0] [c0000000000cb0e4] .irq_exit+0x70/0xd0
[c000000001327960] [c00000000000fee4] .do_IRQ+0x214/0x2d8
[c000000001327a20] [c000000000004d28] hardware_interrupt_entry+0x28/0x2c
--- Exception: 501 at .raw_local_irq_restore+0xc0/0xdc
    LR = .cpu_idle+0x12c/0x1d0
[c000000001327d10] [c000000001290a28] mv88e6131_switch_driver+0x8da0/0x35588 (unreliable)
[c000000001327db0] [c000000000017e14] .cpu_idle+0x12c/0x1d0
[c000000001327e50] [c00000000000a71c] .rest_init+0xe8/0x10c
[c000000001327ee0] [c000000000a12e38] .start_kernel+0x4ec/0x510
[c000000001327f90] [c000000000008c64] .start_here_common+0x2c/0x48
Instruction dump:
e81c001a e93d001a e97e8030 78001f24 79291f24 e87e80c0 e8dd0028 e8fd0030
7d0b002a 7d2b482a 48393a49 60000000 <0fe00000> 480000b8 2f800003 409e00f4
Mapped at:
[<c00000000039c76c>] .debug_dma_map_sg+0xa0/0x220
[<c0000000005085c4>] .scsi_dma_map+0x120/0x164
[<d000000001dd8a6c>] .ibmvscsi_queuecommand+0x180/0x5d0 [ibmvscsic]
[<c0000000004fd9b4>] .scsi_dispatch_cmd+0x21c/0x2cc
[<c000000000506058>] .scsi_request_fn+0x3cc/0x57c
scsi 0:0:1:0: Direct-Access AIX VDASD 0001 PQ: 0 ANSI: 3

This problem has been reported to the community. Here is the link:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2010-April/081541.html

The following patch should fix this issue:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2010-April/081545.html

Please include the above patch in F13.
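
For anyone reading the trace without the dma-debug background: the warning reports that a buffer was mapped through the scatter-gather path (.debug_dma_map_sg via .scsi_dma_map) but released through the single-buffer path (.debug_dma_unmap_page via .unmap_cmd_data). The sketch below is a toy userspace model of that bookkeeping in Python, not kernel code. The names DmaDebugModel, MapType and Mapping are hypothetical, and the model spells "scatter-gather" normally while the kernel log above prints "scather-gather".

```python
from dataclasses import dataclass
from enum import Enum


class MapType(Enum):
    # Hypothetical labels for the two mapping kinds named in the log.
    SINGLE = "single"
    SCATTER_GATHER = "scatter-gather"


@dataclass(frozen=True)
class Mapping:
    dev_addr: int
    size: int
    map_type: MapType


class DmaDebugModel:
    """Toy tracker that remembers how each device address was mapped."""

    def __init__(self):
        self._active = {}

    def map(self, dev_addr, size, map_type):
        # Record the mapping so a later unmap can be checked against it.
        self._active[dev_addr] = Mapping(dev_addr, size, map_type)

    def unmap(self, dev_addr, size, map_type):
        """Return a warning string, or None when the unmap matches the map."""
        entry = self._active.pop(dev_addr, None)
        if entry is None:
            # Wording here is the model's own, not a kernel message.
            return "no matching mapping recorded"
        if entry.map_type is not map_type:
            return (
                "device driver frees DMA memory with wrong function "
                f"[device address=0x{dev_addr:016x}] [size={size} bytes] "
                f"[mapped as {entry.map_type.value}] "
                f"[unmapped as {map_type.value}]"
            )
        return None


if __name__ == "__main__":
    model = DmaDebugModel()
    # Values taken from the log: address 0x11520, 36 bytes.
    model.map(0x11520, 36, MapType.SCATTER_GATHER)
    print(model.unmap(0x11520, 36, MapType.SINGLE))
```

In this model the warning disappears as soon as the unmap uses the same type as the map. Based on the trace alone, that pairing is presumably what the linked patch restores in the driver.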
------- Comment From Subrata Modak subrata.ibm.com 2010-05-03 14:43 EDT-------
The issue is still reproducible with . The following error still occurs. Probably the proposed patch has not made it into the Fedora packages:

Segmentation fault
No root device found
Segmentation fault
No root device found
Boot has failed, sleeping forever.

Red Hat, any information on this?

Regards--
Subrata
------- Comment From Subrata Modak subrata.ibm.com 2010-05-04 07:53 EDT-------
Red Hat, any news about this patch going into the Fedora kernel?

Regards--
Subrata
------- Comment From 2010-06-10 06:29 EDT-------
I checked today's Fedora kernel (kernel/devel/kernel-2.6.34/linux-2.6.34.noarch) and it has the patch merged.
------- Comment From edpollar.com 2010-06-16 12:02 EDT------- reassigning QA...
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle. Changing version to '14'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
------- Comment From Subrata Modak subrata.ibm.com 2010-07-30 07:35 EDT-------
>This bug appears to have been reported against 'rawhide' during the Fedora 14
>development cycle.

Nope. This was produced during the Fedora 13 Alpha release itself, and it still appears in the F13-GA kernel (2.6.33.3-85).

Regards--
Subrata

>Changing version to '14'.
>
>More information and reason for this action is here:
>http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 13 is EOL and there was never a supported ppc64 release. Closing this out.