Bug 951631 - Kernel exception during F19 pre-Alpha install on ppc64: Exception: 901 at .plpar_hcall_norets+0x84/0xd4
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: ppc64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F19PPCBeta, F19PPCBetaBlocker, PPCBetaBlocker
Reported: 2013-04-12 16:11 UTC by Gustavo Luiz Duarte
Modified: 2013-06-05 17:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-05-07 13:41:55 UTC
Type: Bug
Embargoed:


Attachments
Full boot log (41.72 KB, text/x-log)
2013-04-12 16:11 UTC, Gustavo Luiz Duarte
reboot (206.78 KB, text/plain)
2013-04-16 17:40 UTC, IBM Bug Proxy

Description Gustavo Luiz Duarte 2013-04-12 16:11:43 UTC
Created attachment 734817
Full boot log

Description of problem:
When booting the Fedora 19 TC6.2 installation DVD on a ppc64 machine, I get the following kernel exception.
The machine is a 730 system with a single dedicated lpar and physical Fedora 19 TC6.2 media in the DVD drive. The full boot log is attached.


[   54.242685] =========================================================
[   54.242689] [ INFO: possible irq lock inversion dependency detected ]
[   54.242694] 3.9.0-0.rc4.git0.1.fc19.ppc64 #1 Not tainted
[   54.242698] ---------------------------------------------------------
[   54.242702] swapper/9/0 just changed the state of lock:
[   54.242705]  (&(&tp->lock)->rlock){+.-...}, at: [<d00000000542377c>] .tg3_timer+0x9c/0x1170 [tg3]
[   54.242723] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   54.242726]  (devtree_lock){+.+...}

and interrupts could create inverse lock ordering between them.

[   54.242735] 
[   54.242735] other info that might help us debug this:
[   54.242739] Chain exists of:
  &(&tp->lock)->rlock --> pci_lock --> devtree_lock

[   54.242751]  Possible interrupt unsafe locking scenario:
[   54.242751] 
[   54.242755]        CPU0                    CPU1
[   54.242758]        ----                    ----
[   54.242762]   lock(devtree_lock);
[   54.242767]                                local_irq_disable();
[   54.242770]                                lock(&(&tp->lock)->rlock);
[   54.242776]                                lock(pci_lock);
[   54.242782]   <Interrupt>
[   54.242784]     lock(&(&tp->lock)->rlock);
[   54.242789] 
[   54.242789]  *** DEADLOCK ***
[   54.242789] 
[   54.242795] 1 lock held by swapper/9/0:
[   54.242798]  #0:  ((&tp->timer)){+.-...}, at: [<c0000000000bc2f0>] .call_timer_fn+0x0/0x3c0
[   54.242810] 
[   54.242810] the shortest dependencies between 2nd lock and 1st lock:
[   54.242815]   -> (devtree_lock){+.+...} ops: 53674206298112 {
[   54.242824]      HARDIRQ-ON-W at:
[   54.242828]      SOFTIRQ-ON-W at:
[   54.242832]      INITIAL USE at:
[   54.242835]    }
[   54.242838]    ... key      at: [<c000000001529758>] devtree_lock+0x18/0x48
[   54.242845]    ... acquired at:
[   54.242848] 
[   54.242850]  -> (pci_lock){......} ops: 4840428142592 {
[   54.242859]     INITIAL USE at:
[   54.242862]   }
[   54.242865]   ... key      at: [<c00000000150abe8>] pci_lock+0x18/0x48
[   54.242871]   ... acquired at:
[   54.242874] 
[   54.242876] -> (&(&tp->lock)->rlock){+.-...} ops: 1327144894464 {
[   54.242885]    HARDIRQ-ON-W at:
[   54.242889]    IN-SOFTIRQ-W at:
[   54.242892]    INITIAL USE at:
[   54.242896]  }
[   54.242898]  ... key      at: [<d00000000542e221>] __key.48769+0x0/0xffffffffffffaa67 [tg3]
[   54.242906]  ... acquired at:
[   54.242909] 
[   54.242911] 
[   54.242911] stack backtrace:
[   54.242915] Call Trace:
[   54.242920] [c0000003d8d72d60] [c000000000016cb0] .show_stack+0x130/0x200 (unreliable)
[   54.242928] [c0000003d8d72e30] [c000000000137808] .print_irq_inversion_bug+0x258/0x2e0
[   54.242934] [c0000003d8d72ed0] [c000000000137930] .check_usage_forwards+0xa0/0x140
[   54.242940] [c0000003d8d72fc0] [c000000000138734] .mark_lock+0x394/0x780
[   54.242946] [c0000003d8d73070] [c0000000001399c8] .__lock_acquire+0x618/0x1c80
[   54.242952] [c0000003d8d731f0] [c00000000013b90c] .lock_acquire+0xac/0x250
[   54.242959] [c0000003d8d732c0] [c0000000008f0a0c] ._raw_spin_lock+0x5c/0xc0
[   54.242966] [c0000003d8d73350] [d00000000542377c] .tg3_timer+0x9c/0x1170 [tg3]
[   54.242972] [c0000003d8d73400] [c0000000000bc3a8] .call_timer_fn+0xb8/0x3c0
[   54.242977] [c0000003d8d734e0] [c0000000000bc9d8] .run_timer_softirq+0x2e8/0x440
[   54.242984] [c0000003d8d735f0] [c0000000000b01e8] .__do_softirq+0x178/0x540
[   54.242989] [c0000003d8d736f0] [c0000000000b0838] .irq_exit+0xe8/0x100
[   54.242996] [c0000003d8d73770] [c00000000002022c] .timer_interrupt+0x16c/0x500
[   54.243002] [c0000003d8d73820] [c0000000000024f4] decrementer_common+0x174/0x180
[   54.243010] --- Exception: 901 at .plpar_hcall_norets+0x84/0xd4
[   54.243010]     LR = .check_and_cede_processor+0x48/0x80
[   54.243017] [c0000003d8d73b10] [c0000000000852f8] .check_and_cede_processor+0x18/0x80 (unreliable)
[   54.243024] [c0000003d8d73b80] [c0000000000853e8] .dedicated_cede_loop+0x88/0x150
[   54.243031] [c0000003d8d73c40] [c00000000070e12c] .cpuidle_enter+0x2c/0x40
[   54.243037] [c0000003d8d73cb0] [c00000000070eb4c] .cpuidle_idle_call+0xfc/0x4f0
[   54.243044] [c0000003d8d73d70] [c000000000074ef8] .pSeries_idle+0x18/0x40
[   54.243049] [c0000003d8d73de0] [c000000000018c48] .cpu_idle+0x198/0x370
[   54.243055] [c0000003d8d73eb0] [c0000000009095b4] .start_secondary+0x4f8/0x500
[   54.243062] [c0000003d8d73f90] [c0000000000095fc] .start_secondary_prolog+0x10/0x14
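
For anyone reading the report above without a lockdep background, the warning can be thought of as a reachability check over lock dependencies: a lock that is taken inside a softirq (tp->lock, IN-SOFTIRQ-W) leads, through pci_lock, to a lock that is taken with softirqs enabled (devtree_lock, SOFTIRQ-ON-W). The following is only a toy model of that idea, not the kernel's implementation; the function name `find_inversions` and the shortened lock names are made up for illustration, and the input data is copied by hand from the report.

```python
# Toy model of the check behind the lockdep report; NOT the kernel's code.
from collections import defaultdict

# (held, taken) means "taken" was acquired while "held" was held.
# Chain copied from the report: tp->lock --> pci_lock --> devtree_lock
DEPENDENCIES = [
    ("tp->lock", "pci_lock"),
    ("pci_lock", "devtree_lock"),
]

# Usage bits from the report: tp->lock is IN-SOFTIRQ-W,
# devtree_lock is SOFTIRQ-ON-W (taken with softirqs enabled).
USED_IN_SOFTIRQ = {"tp->lock"}
SOFTIRQ_UNSAFE = {"devtree_lock"}


def find_inversions(dependencies, used_in_softirq, softirq_unsafe):
    """Return chains from a softirq-used lock to a softirq-unsafe lock."""
    graph = defaultdict(list)
    for held, taken in dependencies:
        graph[held].append(taken)

    inversions = []
    for start in sorted(used_in_softirq):
        stack = [[start]]
        while stack:
            path = stack.pop()
            for nxt in graph[path[-1]]:
                if nxt in path:  # skip cycles in this toy model
                    continue
                new_path = path + [nxt]
                if nxt in softirq_unsafe:
                    inversions.append(new_path)
                stack.append(new_path)
    return inversions


if __name__ == "__main__":
    for chain in find_inversions(DEPENDENCIES, USED_IN_SOFTIRQ, SOFTIRQ_UNSAFE):
        print(" --> ".join(chain))
```

With the data above, the only chain found is the one lockdep printed. The deadlock scenario follows from it: one CPU holds devtree_lock with softirqs enabled and is interrupted by the tg3 timer, which wants tp->lock, while another CPU already holds tp->lock and is waiting on devtree_lock.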


Version-Release number of selected component (if applicable):
kernel-3.9.0-0.rc4.git0.1.fc19.ppc64

How reproducible:
Always

Steps to Reproduce:
1. Boot from Fedora 19 TC6.2 installation media
  
Actual results:
Attached full boot log.


Expected results:


Additional info:

Comment 1 IBM Bug Proxy 2013-04-16 17:40:48 UTC
Note that I also hit this bug when booting an installed system, not only during installation.

Trying to reboot the installed system produces even uglier output (attached), after which the system gets stuck and I have to restart the lpar using the HMC.

Comment 2 IBM Bug Proxy 2013-04-16 17:40:59 UTC
Created attachment 736468
reboot

Console output of a reboot attempt.

Comment 3 IBM Bug Proxy 2013-04-17 02:21:28 UTC
The reboot problem looks like a different issue; please open another bug for it.

Comment 4 IBM Bug Proxy 2013-05-07 13:31:07 UTC
This bug seems to be fixed in the latest kernel available in Fedora 19 (3.9.0-301.fc19). I tried both ppc64 and ppc64p7.
If anyone is still experiencing this issue with the current kernel in f19, please reopen this bug.
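
To check whether a saved boot log still shows the problem, something along these lines may help. It is only a rough sketch: it looks for the lockdep header line quoted in the description and reads the kernel version from the line that follows it. The function name `find_inversion_reports` and the log path passed on the command line are assumptions for illustration, not part of any Fedora tooling.

```python
import re
import sys

# Header text copied from the lockdep report in the description.
MARKER = "possible irq lock inversion dependency detected"
# The line after the header looks like:
# "[   54.242694] 3.9.0-0.rc4.git0.1.fc19.ppc64 #1 Not tainted"
KERNEL_LINE = re.compile(r"\]\s+(\S+) #\d+ (?:Not tainted|Tainted)")


def find_inversion_reports(lines):
    """Return (line_number, kernel_version) for each inversion report found."""
    # Drop line endings and literal "^M" left over from console captures.
    cleaned = [line.rstrip("\r\n").replace("^M", "") for line in lines]
    reports = []
    for i, line in enumerate(cleaned):
        if MARKER not in line:
            continue
        version = None
        if i + 1 < len(cleaned):
            match = KERNEL_LINE.search(cleaned[i + 1])
            if match:
                version = match.group(1)
        reports.append((i + 1, version))
    return reports


if __name__ == "__main__":
    # Usage (hypothetical file name): python check_log.py boot.log
    with open(sys.argv[1], errors="replace") as log:
        found = find_inversion_reports(log)
    for line_number, version in found:
        print(f"line {line_number}: inversion report, kernel {version}")
    if not found:
        print("no irq lock inversion report found")
```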

Comment 5 IBM Bug Proxy 2013-06-05 17:51:21 UTC
This issue is fixed in fc19, so I am closing this bug.

