Bug 1389807

Summary: Attempts to reboot/poweroff die with "unable to handle kernel NULL pointer dereference at 0000000000000fe8"
Product: [Fedora] Fedora
Reporter: Michal Jaegermann <michal.jnn>
Component: xorg-x11-drv-ati
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: gansalmon, ichavero, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab, xgl-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-30 22:17:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description: dmesg output after an attempt to 'reboot -f' with kernel-4.9.0-0.rc2.git1.1.fc26.x86_64
Flags: none

Description Michal Jaegermann 2016-10-28 17:21:44 UTC
Created attachment 1215055
dmesg output after an attempt to 'reboot -f' with kernel-4.9.0-0.rc2.git1.1.fc26.x86_64

Description of problem:
Kernels from the 4.9.0 series make it impossible to reboot or power off.  After issuing such a command I get a blank screen and an unresponsive machine which can only be switched off by cutting power.  Nothing is recorded in the journal that would suggest what the problem might be.

An attempt to 'poweroff -f' or 'reboot -f' produces on a terminal:

Rebooting.
Killed

followed by a shell prompt and the following backtrace in dmesg:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000fe8
 IP: [<ffffffffc049462c>] radeon_connector_unregister+0xc/0x40 [radeon]
 PGD 0 

 Oops: 0000 [#1] SMP
 Modules linked in: cfg80211 rfkill hwmon_vid ppdev powernow_k8 edac_mce_amd edac_core snd_via82xx snd_mpu401_uart gameport snd_rawmidi snd_ac97_codec k8temp ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore i2c_viapro parport_pc parport shpchp acpi_cpufreq tpm_tis tpm_tis_core tpm binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc amdkfd amd_iommu_v2 radeon ata_generic pata_acpi i2c_algo_bit drm_kms_helper ttm drm 8139too 8139cp serio_raw mii pata_via sata_via fjes uas usb_storage
 CPU: 0 PID: 1425 Comm: reboot Not tainted 4.9.0-0.rc2.git1.1.fc26.x86_64 #1
 Hardware name: Acer Aspire T135/K8VM800MAE, BIOS R01-A3 06/27/2005
 task: ffffa0259a75b100 task.stack: ffffbdbc818ac000
 RIP: 0010:[<ffffffffc049462c>]  [<ffffffffc049462c>] radeon_connector_unregister+0xc/0x40 [radeon]
 RSP: 0018:ffffbdbc818afca8  EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffffa0259898d000 RCX: 0000000000000006
 RDX: 0000000000000006 RSI: ffffa0259a75be18 RDI: ffffa0259898d000
 RBP: ffffbdbc818afcb8 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0259898a988
 R13: ffffa0259898a000 R14: ffffa0259d339100 R15: 00000000fee1dead
 FS:  00007f08c09d0280(0000) GS:ffffa0259fc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000fe8 CR3: 0000000035a69000 CR4: 00000000000006f0
 Stack:
  ffffffffc03dfd0f ffffa0259898d000 ffffbdbc818afcd8 ffffffffc03e0400
  ffffa0259898a000 ffffa0259898a9d0 ffffbdbc818afd00 ffffffffc03d00cd
  ffffa0259898a000 ffffa0259898a000 ffffffffc05ae000 ffffbdbc818afd28
 Call Trace:
  [<ffffffffc03dfd0f>] ? drm_connector_unregister.part.6+0x1f/0x40 [drm]
  [<ffffffffc03e0400>] drm_connector_unregister_all+0x40/0x60 [drm]
  [<ffffffffc03d00cd>] drm_modeset_unregister_all+0x1d/0x70 [drm]
  [<ffffffffc03cacc6>] drm_dev_unregister+0xb6/0xc0 [drm]
  [<ffffffffc03cb382>] drm_put_dev+0x32/0x60 [drm]
  [<ffffffffc046e305>] radeon_pci_shutdown+0x15/0x20 [radeon]
  [<ffffffff8c4ca476>] pci_device_shutdown+0x36/0x70
  [<ffffffff8c5ea2d0>] device_shutdown+0xe0/0x1e0
  [<ffffffff8c0dbb66>] kernel_restart_prepare+0x36/0x40
  [<ffffffff8c0dbc12>] kernel_restart+0x12/0x60
  [<ffffffff8c0dbf98>] SYSC_reboot+0x208/0x220
  [<ffffffff8c187f0d>] ? __audit_syscall_entry+0xad/0xf0
  [<ffffffff8c111175>] ? trace_hardirqs_on_caller+0xf5/0x1b0
  [<ffffffff8c11123d>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffff8c187f0d>] ? __audit_syscall_entry+0xad/0xf0
  [<ffffffff8c187f0d>] ? __audit_syscall_entry+0xad/0xf0
  [<ffffffff8c003509>] ? syscall_trace_enter+0x1d9/0x3a0
  [<ffffffff8c111175>] ? trace_hardirqs_on_caller+0xf5/0x1b0
  [<ffffffff8c0dbffe>] SyS_reboot+0xe/0x10
  [<ffffffff8c003eec>] do_syscall_64+0x6c/0x1f0
  [<ffffffff8c90ca49>] entry_SYSCALL64_slow_path+0x25/0x25
 Code: 9b 3d de cb 48 89 df e8 23 b7 f4 ff 48 89 df e8 3b b7 f4 ff 48 89 df e8 83 3d de cb 5b 5d c3 66 66 66 66 90 48 8b 87 a0 03 00 00 <80> b8 e8 0f 00 00 00 75 01 c3 55 48 89 e5 53 48 89 fb 48 8d b8 
 RIP  [<ffffffffc049462c>] radeon_connector_unregister+0xc/0x40 [radeon]
  RSP <ffffbdbc818afca8>
 CR2: 0000000000000fe8
 ---[ end trace 5b24c09c72caa76e ]---
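
One way to read the fault address in the trace above, as a minimal sketch: the byte marked <80> on the "Code:" line starts the faulting instruction, and decoding its ModR/M byte and 32-bit displacement suggests a one-byte compare at offset 0xfe8 from RAX. Since RAX is 0 in the register dump, that would give exactly the CR2 value 0xfe8. The helper name faulting_access is hypothetical, it handles only this one instruction form (opcode 80 /7 with disp32), and it is not a general disassembler. Which radeon structure field sits at that offset is not determined here.

```python
import struct

# Tail of the "Code:" line from the oops; <80> marks the first byte
# of the faulting instruction.
CODE = "66 66 66 66 90 48 8b 87 a0 03 00 00 <80> b8 e8 0f 00 00 00 75 01 c3"

# Values copied from the register dump in the oops.
REGS = {"rax": 0x0, "rdi": 0xFFFFA0259898D000}
CR2 = 0x0000000000000FE8

# x86-64 register numbering for the ModR/M r/m field (no REX.B prefix).
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def faulting_access(code):
    """Return (base_register, displacement) for the marked instruction.

    Hypothetical helper: supports only 'cmp byte [reg + disp32], imm8'
    (opcode 0x80, /7, mod == 2, no SIB byte).
    """
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    insn = bytes(int(t.strip("<>"), 16) for t in tokens[start:])
    mod, reg, rm = split_modrm(insn[1])
    if insn[0] != 0x80 or reg != 7 or mod != 2 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    (disp,) = struct.unpack_from("<i", insn, 2)  # little-endian disp32
    return REG_NAMES[rm], disp


base, disp = faulting_access(CODE)
address = (REGS[base] + disp) & 0xFFFFFFFFFFFFFFFF
print(f"access at {base} + {disp:#x} = {address:#x}; CR2 = {CR2:#x}")
```

Under these assumptions the computed address and CR2 agree, which is consistent with a NULL pointer being dereferenced at a field offset rather than a wild pointer.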

After such a bomb, a second attempt to power off or reboot immediately locks up the whole system, i.e. a power switch is required to continue, and nothing new is recorded in the journal.

Version-Release number of selected component (if applicable):
kernel-4.9.0-0.rc2.git1.1.fc26.x86_64

How reproducible:
always, just try to reboot

Expected results:
The machine reboots or powers off, as it used to do in the past.

Additional info:
Unfortunately, through practically all of November 2016 I will be away and without access to my test machine.  Hopefully this can also be observed on other hardware.

Comment 1 Michal Jaegermann 2016-11-30 22:17:34 UTC
At last I had an opportunity to try kernel-4.9.0-0.rc6.git2.1.fc26.x86_64.  Two attempts and no crash on reboot/poweroff.  The reported problem used to happen every time, so I guess that means the issue was fixed. Thanks!