Bug 2060064 - Nouveau driver panics kernel on NVidia Jetson nano
Summary: Nouveau driver panics kernel on NVidia Jetson nano
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 35
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
 
Reported: 2022-03-02 15:56 UTC by William Cohen
Modified: 2022-12-13 16:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-13 16:48:53 UTC
Type: Bug
Embargoed:



Description William Cohen 2022-03-02 15:56:09 UTC
1. Please describe the problem:

I noticed that the Jetson Nano board failed to boot when HDMI wasn't connected to a monitor. I connected a serial console and found that the kernel panics when HDMI isn't plugged in.


2. What is the Version-Release number of the kernel:
kernel-5.16.11-200.fc35.aarch64

3. Did it work previously in Fedora? 

The initial kernel included in Fedora 35, kernel-5.14.10-300.fc35, worked fine headless.

The problem was also observed in earlier 5.15.xx kernels.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Set up an SD card with the Fedora 35 raw SD card image from https://download.fedoraproject.org/pub/fedora/linux/releases/35/Workstation/armhfp/images/Fedora-Workstation-35-1.2.armhfp.raw.xz, following the instructions at https://nullr0ute.com/2020/11/installing-fedora-on-the-nvidia-jetson-nano/

Note that the /boot/loader/entries/* options need to be corrected with the following to obtain console output:

console=ttyS0,115200  
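
A minimal sketch of that correction, assuming a boot loader entry whose kernel arguments sit on a single line starting with `options`. The function name `add_console_option` and the sample entry contents are hypothetical, and the sketch only edits text in memory rather than touching anything under /boot:

```python
def add_console_option(entry_text: str, console: str = "console=ttyS0,115200") -> str:
    """Return the entry text with `console` appended to its options line.

    Assumption: the entry has one line beginning with "options ".
    Entries that already carry the exact argument are left unchanged.
    """
    out = []
    for line in entry_text.splitlines():
        if line.startswith("options ") and console not in line.split():
            line = f"{line.rstrip()} {console}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical entry contents, for illustration only.
sample = "title Fedora Linux\nlinux /vmlinuz-example\noptions root=UUID=example ro\n"
print(add_console_option(sample))
```

The edit is idempotent, so running it twice over the same entry does not duplicate the argument.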

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

With kernel-5.17.0-0.rc6.109.fc37.aarch64 the board boots up headless. However, with HDMI plugged in, kernel-5.17.0-0.rc6.109.fc37.aarch64 panics in the nouveau driver.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No out-of-tree modules loaded.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

From the serial console for kernel-5.16.11-200.fc35.aarch64 headless:

           
[    7.780203] drm drm: [drm] Cannot find any crtc or sizes                     
[    7.785800] drm drm: [drm] Cannot find any crtc or sizes                     
[    7.788550] [drm] Initialized tegra 1.0.0 20120330 for drm on minor 0        
[    7.791284] drm drm: [drm] Cannot find any crtc or sizes                     
[    7.799419] Failed to set up IOMMU for device 57000000.gpu; retaining platform DMA ops
[   49.922584] SError Interrupt on CPU2, code 0xbf000002 -- SError
[   49.922612] CPU: 2 PID: 378 Comm: kworker/u8:3 Not tainted 5.16.11-200.fc35.aarch64 #1
[   49.922624] Hardware name: nvidia p3450-0000/p3450-0000, BIOS 2020.10 10/06/2020
[   49.922632] Workqueue: events_unbound deferred_probe_work_func               
[   49.922664] pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)  
[   49.922672] pc : el1_interrupt+0x20/0x54                                     
[   49.922688] lr : el1h_64_irq_handler+0x18/0x24                               
[   49.922694] sp : ffff80000b9b38c0                                            
[   49.922698] x29: ffff80000b9b38c0 x28: ffff00008683c300 x27: ffff00008b7e7c48
[   49.922707] x26: ffff800001691150 x25: ffff800001691150 x24: 0000000057000000
[   49.922715] x23: 0000000060400005 x22: ffff8000015562d0 x21: 00000000db0ac000
[   49.922722] x20: ffff800008200190 x19: ffff80000b9b38f0 x18: 0000000000000000
[   49.922729] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffdc41bbe0
[   49.922737] x14: 0000000000000000 x13: 0000000000000000 x12: ffff8000096b6ec8
[   49.922743] x11: ffff8000081bf000 x10: 0000000000000000 x9 : ffff8000085138e0
[   49.922751] x8 : ffff000087c6f800 x7 : ffff80000aafb710 x6 : 0068000057e00f11
[   49.922759] x5 : ffff0000804bb310 x4 : 0000000000000000 x3 : 0000000057e00000
[   49.922766] x2 : 0068000057e00f10 x1 : 00000000000000c0 x0 : ffff80000b9b38f0
[   49.922781] Kernel panic - not syncing: Asynchronous SError Interrupt        
[   49.922787] CPU: 2 PID: 378 Comm: kworker/u8:3 Not tainted 5.16.11-200.fc35.aarch64 #1
[   49.922792] Hardware name: nvidia p3450-0000/p3450-0000, BIOS 2020.10 10/06/2020
[   49.922794] Workqueue: events_unbound deferred_probe_work_func               
[   49.922803] Call trace:                                                      
[   49.922806]  dump_backtrace+0x0/0x1c0                                        
[   49.922821]  show_stack+0x24/0x30                                            
[   49.922825]  dump_stack_lvl+0x68/0x84                                        
[   49.922840]  dump_stack+0x18/0x34                                            
[   49.922846]  panic+0x134/0x338                                               
[   49.922858]  nmi_panic+0x98/0xa0                                             
[   49.922871]  arm64_serror_panic+0x7c/0x90                                    
[   49.922881]  do_serror+0x34/0x6c                                             
[   49.922885]  el1h_64_error_handler+0x30/0x50                                 
[   49.922891]  el1h_64_error+0x7c/0x80                                         
[   49.922899]  el1_interrupt+0x20/0x54                                         
[   49.922905]  el1h_64_irq_handler+0x18/0x24                                   
[   49.922909]  el1h_64_irq+0x7c/0x80                                           
[   49.923293] VDD_5V_USB: disabling                                            
[   49.922912]  nvkm_device_ctor+0x1b40/0x3d50 [nouveau]                        
[   49.923815]  nvkm_device_tegra_new+0x2e8/0x3a0 [nouveau]                     
[   49.924366]  nouveau_platform_device_create+0x4c/0x110 [nouveau]             
[   49.924978]  nouveau_platform_probe+0x38/0x90 [nouveau]                      
[   49.925443]  platform_probe+0x74/0xf0                                        
[   49.925458]  really_probe+0xc4/0x470                                         
[   49.925464]  __driver_probe_device+0x11c/0x190                               
[   49.925470]  driver_probe_device+0x48/0x110                                  
[   49.925475]  __device_attach_driver+0xa4/0x140                               
[   49.925481]  bus_for_each_drv+0x74/0xb4                                      
[   49.925489]  __device_attach+0xb8/0x1b0                                      
[   49.925493]  device_initial_probe+0x20/0x30                                  
[   49.925500]  bus_probe_device+0xa4/0xb0                                      
[   49.925505]  deferred_probe_work_func+0xc0/0x114                             
[   49.925510]  process_one_work+0x1f4/0x490                                    
[   49.925525]  worker_thread+0x184/0x500                                       
[   49.925530]  kthread+0x13c/0x140                                             
[   49.925538]  ret_from_fork+0x10/0x20                                         
[   49.925555] SMP: stopping secondary CPUs                                     
[   49.929252] Kernel Offset: 0x1f0000 from 0xffff800008000000                  
[   49.929257] PHYS_OFFSET: 0x80000000                                          
[   49.929259] CPU features: 0x11,000018c2,00000846                             
[   49.929266] Memory Limit: none                                               
[   50.216583] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
[   50.216638] ------------[ cut here ]------------
[   50.216644] WARNING: CPU: 2 PID: 378 at kernel/sched/core.c:3049 set_task_cpu+0x158/0x1e4
[   50.216664] Modules linked in: r8169 hid_logitech_dj(+) mmc_block nouveau rtc_max77686 tegra_drm i2c_algo_bit drm_ttm_helper ttm drm_kms_helper crct10dif_ce ghash_ce syscopyarea sysfillrect sysimgblt fb_sys_fops cec gpio_keys sdhci_tegra drm sdhci_pltfm pwm_fan xhci_tegra sdhci phy_tegra_xusb tegra210_emc cqhci rtc_tegra host1x i2c_tegra ipmi_devintf ipmi_msghandler fuse
[   50.216744] CPU: 2 PID: 378 Comm: kworker/u8:3 Not tainted 5.16.11-200.fc35.aarch64 #1
[   50.216752] Hardware name: nvidia p3450-0000/p3450-0000, BIOS 2020.10 10/06/2020
[   50.216758] Workqueue: events_unbound deferred_probe_work_func               
[   50.216767] pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)  
[   50.216773] pc : set_task_cpu+0x158/0x1e4                                    
[   50.216778] lr : try_to_wake_up+0x160/0x614                                  
[   50.216782] sp : ffff800008013d40                                            
[   50.216785] x29: ffff800008013d40 x28: ffff0000fe8fd940 x27: ffff0000fe8fd900
[   50.216793] x26: ffff80000835b5e0 x25: 0000000000000000 x24: 00000000000000c0
[   50.216800] x23: ffff80000a5e3558 x22: 00000000000000c0 x21: ffff80000a5dec00
[   50.216807] x20: 0000000000000000 x19: ffff000087f72180 x18: ffffffffffffffff
[   50.216814] x17: ffff8000f488d000 x16: ffff800008014000 x15: 0000000000004000
[   50.216821] x14: 797341203a676e69 x13: 2d2d2d5d20747075 x12: 727265746e492072
[   50.216827] x11: 6f72724553207375 x10: 6f6e6f7268636e79 x9 : ffff8000082ddb40
[   50.216834] x8 : 000000000000000f x7 : 0000000000000001 x6 : 0000000000000917
[   50.216840] x5 : ffff00008025ec00 x4 : ffff8000f484f000 x3 : 000000000000000f
[   50.216846] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000004
[   50.216853] Call trace:                                                      
[   50.216857]  set_task_cpu+0x158/0x1e4                                        
[   50.216862]  try_to_wake_up+0x160/0x614                                      
[   50.216866]  wake_up_process+0x24/0x30                                       
[   50.216871]  hrtimer_wakeup+0x2c/0x44                                        
[   50.216885]  __hrtimer_run_queues+0x134/0x2e0                                
[   50.216892]  hrtimer_interrupt+0x120/0x310                                   
[   50.216898]  tegra_timer_isr+0x34/0x44                                       
[   50.216912]  __handle_irq_event_percpu+0x68/0x1f0                            
[   50.216921]  handle_irq_event+0x5c/0x180                                     
[   50.216926]  handle_fasteoi_irq+0xcc/0x200                                   
[   50.216936]  generic_handle_domain_irq+0x48/0x70                             
[   50.216941]  gic_handle_irq+0x68/0xa0                                        
[   50.216945]  call_on_irq_stack+0x2c/0x38                                     
[   50.216950]  do_interrupt_handler+0x88/0xa0                                  
[   50.216958]  el1_interrupt+0x34/0x54                                         
[   50.216967]  el1h_64_irq_handler+0x18/0x24                                   
[   50.216972]  el1h_64_irq+0x7c/0x80                                           
[   50.216976]  panic+0x2c4/0x338                                               
[   50.216983]  nmi_panic+0x98/0xa0                                             
[   50.216991]  arm64_serror_panic+0x7c/0x90                                    
[   50.217001]  do_serror+0x34/0x6c                                             
[   50.217005]  el1h_64_error_handler+0x30/0x50                                 
[   50.217009]  el1h_64_error+0x7c/0x80                                         
[   50.217014]  el1_interrupt+0x20/0x54                                         
[   50.217018]  el1h_64_irq_handler+0x18/0x24                                   
[   50.217022]  el1h_64_irq+0x7c/0x80                                           
[   50.217026]  nvkm_device_ctor+0x1b40/0x3d50 [nouveau]                        
[   50.217535]  nvkm_device_tegra_new+0x2e8/0x3a0 [nouveau]                     
[   50.217891]  nouveau_platform_device_create+0x4c/0x110 [nouveau]             
[   50.218251]  nouveau_platform_probe+0x38/0x90 [nouveau]                      
[   50.218613]  platform_probe+0x74/0xf0                                        
[   50.218619]  really_probe+0xc4/0x470                                         
[   50.218623]  __driver_probe_device+0x11c/0x190                               
[   50.218628]  driver_probe_device+0x48/0x110                                  
[   50.218632]  __device_attach_driver+0xa4/0x140                               
[   50.218637]  bus_for_each_drv+0x74/0xb4                                      
[   50.218641]  __device_attach+0xb8/0x1b0                                      
[   50.218646]  device_initial_probe+0x20/0x30                                  
[   50.218651]  bus_probe_device+0xa4/0xb0                                      
[   50.218655]  deferred_probe_work_func+0xc0/0x114                             
[   50.218659]  process_one_work+0x1f4/0x490                                    
[   50.218668]  worker_thread+0x184/0x500                                       
[   50.218674]  kthread+0x13c/0x140                                             
[   50.218679]  ret_from_fork+0x10/0x20                                         
[   50.218686] ---[ end trace e4cac48e1e506ecc ]--- 
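
The `code 0xbf000002` value on the SError line is the exception syndrome register (ESR) value that the arm64 kernel prints. As a hedged aid to reading it, the sketch below splits it into the architectural top-level fields; the helper name `decode_esr` is made up for illustration, and the field layout is the standard AArch64 ESR_ELx one (EC in bits 31:26, IL in bit 25, ISS in bits 24:0):

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_ELx value into its top-level fields."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # instruction length bit, bit 25
        "iss": esr & 0x1FFFFFF,     # instruction specific syndrome, bits [24:0]
    }


fields = decode_esr(0xBF000002)

# EC 0x2F is the SError interrupt class. For that class, ISS bit 24 (IDS)
# being set means the remaining syndrome bits are implementation defined.
ids = (fields["iss"] >> 24) & 0x1
print(hex(fields["ec"]), fields["il"], hex(fields["iss"]), ids)
```

With IDS set, the low syndrome bits cannot be interpreted from the architecture manual alone, so the code by itself does not identify which access triggered the error.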


kernel-5.17.0-0.rc6.109.fc37.aarch64 gets a different panic with HDMI plugged in:

[    7.024111]  usb2-1: supply vbus not found, using dummy regulator
[    7.034453]  usb2-2: supply vbus not found, using dummy regulator
[   49.141501] SError Interrupt on CPU1, code 0xbf000002 -- SError              
[   49.141530] CPU: 1 PID: 326 Comm: systemd-udevd Not tainted 5.17.0-0.rc6.109.fc37.aarch64 #1
[   49.141545] Hardware name: nvidia p3450-0000/p3450-0000, BIOS 2020.10 10/06/2020
[   49.141551] pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)  
[   49.141561] pc : el1_interrupt+0x20/0x54                                     
[   49.141587] lr : el1h_64_irq_handler+0x18/0x24                               
[   49.141597] sp : ffff80000bbc3740                                            
[   49.141600] x29: ffff80000bbc3740 x28: ffff0000867c8000 x27: ffff000087491848
[   49.141614] x26: ffff80000141e140 x25: ffff80000141e140 x24: 0000000057000000
[   49.141622] x23: 0000000060400005 x22: ffff8000012e3ab0 x21: 00000000db034000
[   49.141630] x20: ffff800008070190 x19: ffff80000bbc3770 x18: 0000000000000000
[   49.141638] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[   49.141645] x14: 0000000000000000 x13: ffff000080436100 x12: ffff80000a011160
[   49.141655] x11: ffff800009e35800 x10: 0000000000000000 x9 : ffff80000838aad0
[   49.141662] x8 : ffff80000dffffff x7 : 0000000057000000 x6 : ffff80000e000000
[   49.141670] x5 : 000000000000003f x4 : 0000000001000000 x3 : 0000000057e00000
[   49.141677] x2 : 0068000057e00f10 x1 : 00000000000000c0 x0 : ffff80000bbc3770
[   49.141692] Kernel panic - not syncing: Asynchronous SError Interrupt        
[   49.141698] CPU: 1 PID: 326 Comm: systemd-udevd Not tainted 5.17.0-0.rc6.109.fc37.aarch64 #1
[   49.141704] Hardware name: nvidia p3450-0000/p3450-0000, BIOS 2020.10 10/06/2020
[   49.141709] Call trace:                                                      
[   49.141714]  dump_backtrace+0xf8/0x130                                       
[   49.141730]  show_stack+0x24/0x70                                            
[   49.141735]  dump_stack_lvl+0x64/0x80                                        
[   49.141752]  dump_stack+0x18/0x34                                            
[   49.141758]  panic+0x134/0x34c                                               
[   49.141767]  nmi_panic+0x98/0xa0                                             
[   49.141779]  arm64_serror_panic+0x7c/0x90                                    
[   49.141787]  do_serror+0x34/0x7c                                             
[   49.141793]  el1h_64_error_handler+0x30/0x50                                 
[   49.141798]  el1h_64_error+0x7c/0x80                                         
[   49.141807]  el1_interrupt+0x20/0x54                                         
[   49.141811]  el1h_64_irq_handler+0x18/0x24                                   
[   49.141816]  el1h_64_irq+0x7c/0x80                                           
[   49.141847] VDD_HDMI_5V0: disabling                                          
[   49.141820]  nvkm_device_ctor+0x1bf0/0x3bd0 [nouveau]                        
[   49.142774]  nvkm_device_tegra_new+0x2e8/0x3a0 [nouveau]                     
[   49.143352]  nouveau_platform_device_create+0x4c/0x110 [nouveau]             
[   49.143892]  nouveau_platform_probe+0x38/0x90 [nouveau]                      
[   49.144548]  platform_probe+0x74/0xd0                                        
[   49.144567]  really_probe+0x1c4/0x440                                        
[   49.144574]  __driver_probe_device+0x11c/0x190                               
[   49.144581]  driver_probe_device+0x48/0x104                                  
[   49.144586]  __driver_attach+0xd8/0x1b0                                      
[   49.144591]  bus_for_each_dev+0x6c/0xb0                                      
[   49.144612]  driver_attach+0x30/0x40                                         
[   49.144616]  bus_add_driver+0x150/0x230                                      
[   49.144620]  driver_register+0x84/0x140                                      
[   49.144625]  __platform_driver_register+0x34/0x40                            
[   49.144632]  nouveau_drm_init+0x1a8/0x1000 [nouveau]                         
[   49.145095]  do_one_initcall+0x40/0x220                                      
[   49.145107]  do_init_module+0x50/0x260                                       
[   49.145122]  load_module+0x930/0xac0                                         
[   49.145129]  __do_sys_init_module+0xec/0x140                                 
[   49.145136]  __arm64_sys_init_module+0x28/0x34                               
[   49.145143]  invoke_syscall+0x50/0x120                                       
[   49.145156]  el0_svc_common.constprop.0+0xd4/0xf4                            
[   49.145163]  do_el0_svc+0x30/0x9c                                            
[   49.145169]  el0_svc+0x28/0xb0                                               
[   49.145180]  el0t_64_sync_handler+0x10c/0x140                                
[   49.145186]  el0t_64_sync+0x1a4/0x1a8                                        
[   49.145199] SMP: stopping secondary CPUs                                     
[   49.147808] Kernel Offset: 0x60000 from 0xffff800008000000                   
[   49.147812] PHYS_OFFSET: 0x80000000                                          
[   49.147815] CPU features: 0x22,000018c2,00000846                             
[   49.147822] Memory Limit: none                                               
[   49.456702] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
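
Although the two traces reach the driver through different paths (deferred probe versus module load), both interrupted frames sit in the same nouveau function. A small sketch for pulling the first frame belonging to a given module out of a call trace, assuming the `symbol+0xoffset/0xsize [module]` frame format seen in the logs; `first_module_frame` and `FRAME_RE` are illustrative names, and the two excerpts are copied from the traces in this report:

```python
import re

# One call-trace frame: "[ timestamp ]  symbol+0xoff/0xsize [module]"
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def first_module_frame(log: str, module: str):
    """Return (symbol, offset, size) of the first frame in `module`, or None."""
    for line in log.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group("mod") == module:
            return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)
    return None


headless = """\
[   49.922909]  el1h_64_irq+0x7c/0x80
[   49.922912]  nvkm_device_ctor+0x1b40/0x3d50 [nouveau]
[   49.923815]  nvkm_device_tegra_new+0x2e8/0x3a0 [nouveau]
"""
hdmi = """\
[   49.141816]  el1h_64_irq+0x7c/0x80
[   49.141820]  nvkm_device_ctor+0x1bf0/0x3bd0 [nouveau]
[   49.142774]  nvkm_device_tegra_new+0x2e8/0x3a0 [nouveau]
"""

for name, log in (("5.16.11 headless", headless), ("5.17.0-rc6 HDMI", hdmi)):
    print(name, first_module_frame(log, "nouveau"))
```

The offsets differ between the two kernels because the function was rebuilt, so only the symbol name is comparable across builds.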

Comment 1 Ben Cotton 2022-11-29 17:59:16 UTC
This message is a reminder that Fedora Linux 35 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '35'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 35 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 2 Ben Cotton 2022-12-13 16:48:53 UTC
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13.

Fedora Linux 35 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

