Bug 1583795 - system crashed or became unresponsive
Summary: system crashed or became unresponsive
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-05-29 17:37 UTC by bharatt hareindharan
Modified: 2018-11-30 23:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-11-30 23:40:12 UTC
Type: Bug
Embargoed:


Attachments
system logs (41.19 KB, application/x-gzip)
2018-05-29 17:37 UTC, bharatt hareindharan
system logs (388.25 KB, text/plain)
2018-06-03 19:44 UTC, bharatt hareindharan

Description bharatt hareindharan 2018-05-29 17:37:25 UTC
Created attachment 1445477 [details]
system logs

Description of problem:

The system became completely unresponsive to keyboard and mouse input. It freezes suddenly, and the only way to recover is a hard reboot (pressing the reset button on the case).

Version-Release number of selected component (if applicable):

Kernel - 4.16.11-200.fc27.x86_64

NOTE:
   I am not sure the selected component is correct, since the entire system freezes.

How reproducible:

Cannot be reproduced on demand; the system becomes unresponsive at unpredictable times.

This time, it happened while I was browsing with Firefox with ~5 tabs open.

The system has also run without freezing while Firefox had 20+ tabs open.

These crashes started after I upgraded my system from F24 to F27.

Steps to Reproduce:
1. Cannot be reproduced.
2.
3.

Actual results:

The system freezes and does not respond to keyboard or mouse input. A hard reboot is required.

The system is not under high load in the run-up to the freeze.

I cannot check whether the load spikes at the moment of the freeze, because the keyboard does not respond.

Expected results:

The system should keep running without freezing.


Additional info:

I tried to configure kdump to capture what is making the system unresponsive, but the "kdump" service fails to start; I have filed "Bug 1583482" for that.

I have attached system logs, which start with the system being suspended at "May 29 @21:27:00" and run until the crash at "May 30 @02:15".

Kindly go through the system logs and help find out what is happening.

NOTE:
- My system runs on an AMD Ryzen 1700 processor. F24 ran flawlessly on this processor and never crashed like this, but with F27 or F28 I am experiencing unpredictable crashes.

- I have another system at the office running F27 on an Intel processor (I do not remember the exact model), and it runs fine without any crashes.

Comment 1 bharatt hareindharan 2018-06-02 05:34:24 UTC
Today the system again became unresponsive soon after it was resumed from suspend.

Jun 02 13:07:17 vibha.fedora.home systemd[1]: Reached target Sleep.
Jun 02 13:07:17 vibha.fedora.home systemd[1]: Starting Suspend...
Jun 02 13:07:17 vibha.fedora.home systemd-sleep[11916]: Suspending system...
Jun 02 13:07:17 vibha.fedora.home kernel: PM: suspend entry (deep)

-- Reboot -- [ # Manually rebooted, since the system became unresponsive ]

Jun 02 15:04:37 vibha.fedora.home kernel: Linux version 4.16.12-200.fc27.x86_64 (mockbuild.fedoraproject.org) (gcc version 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC)) #1 SMP Fri May 25 21:10:16 UTC 2018
Jun 02 15:04:37 vibha.fedora.home kernel: Command line: BOOT_IMAGE=/vmlinuz-4.16.12-200.fc27.x86_64 root=/dev/mapper/sysvg-root ro crashkernel=auto rd.lvm.lv=sysvg/root rd.lvm.lv=sysvg/swap rhgb quiet
Jun 02 15:04:37 vibha.fedora.home kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Jun 02 15:04:37 vibha.fedora.home kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Jun 02 15:04:37 vibha.fedora.home kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Jun 02 15:04:37 vibha.fedora.home kernel: x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
Jun 02 15:04:37 vibha.fedora.home kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
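
The excerpt above shows a "PM: suspend entry" line followed directly by a reboot, with no sign of a resume in between. A minimal Python sketch for finding that pattern in a longer journal export is below. It assumes the kernel logs "PM: suspend exit" on a successful resume and that boots are separated by a "-- Reboot --" marker as in the excerpt; the function name `find_hung_suspends` is made up for illustration.

```python
def find_hung_suspends(journal_lines):
    """Return suspend-entry lines that were followed by a reboot marker
    without an intervening 'PM: suspend exit' line."""
    hung = []
    pending = None  # most recent suspend entry not yet matched by an exit
    for line in journal_lines:
        if "PM: suspend entry" in line:
            pending = line.strip()
        elif "PM: suspend exit" in line:
            # Assumed marker for a completed resume.
            pending = None
        elif line.strip().startswith("-- Reboot --"):
            if pending is not None:
                hung.append(pending)
            pending = None
    return hung
```

Feeding it the lines of a saved journal text file (for example via `open(path).read().splitlines()`) would list each suspend attempt that never logged a resume before the next boot.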

Comment 2 bharatt hareindharan 2018-06-02 05:40:17 UTC
Not sure what is making the system so unstable; it has to be hard rebooted about once every 3 days on average.

15:36:15 $ last reboot
reboot   system boot  4.16.12-200.fc27 Sat Jun  2 15:04   still running
reboot   system boot  4.16.12-200.fc27 Thu May 31 07:55   still running <=== Still shown as running because the system became unresponsive and the manual hard reboot was not recorded. After a graceful reboot, this entry will be updated.
reboot   system boot  4.16.11-200.fc27 Wed May 30 02:22 - 07:55 (1+05:33)
reboot   system boot  4.16.11-200.fc27 Mon May 28 03:35 - 07:55 (3+04:20)
reboot   system boot  4.16.9-200.fc27. Thu May 24 22:29 - 03:34 (3+05:04)
reboot   system boot  4.16.7-200.fc27. Thu May 24 22:05 - 22:28  (00:23) <=== System went into unresponsive state and had to be manually rebooted.
reboot   system boot  4.16.7-200.fc27. Tue May 22 06:04 - 22:28 (2+16:24)
reboot   system boot  4.16.7-200.fc27. Mon May 21 02:40 - 06:04 (1+03:23)
reboot   system boot  4.16.7-200.fc27. Tue May 15 05:10 - 06:04 (7+00:53)
reboot   system boot  4.13.9-300.fc27. Tue May 15 03:51 - 05:08  (01:16) <=== System went into unresponsive state and had to be manually rebooted.

Comment 3 bharatt hareindharan 2018-06-03 18:33:47 UTC
The system crashed again; this is happening very frequently. I have never experienced Fedora crashing like this.

$ uptime
 04:29:26 up 1 min,  1 user,  load average: 2.15, 0.82, 0.30

$ last reboot
reboot   system boot  4.16.12-200.fc27 Mon Jun  4 04:28   still running
reboot   system boot  4.16.12-200.fc27 Sat Jun  2 15:52   still running <=== Still shown as running because the system became unresponsive and the manual hard reboot was not recorded. After a graceful reboot, this entry will be updated.
reboot   system boot  4.16.12-200.fc27 Sat Jun  2 15:04 - 15:51  (00:47)
reboot   system boot  4.16.12-200.fc27 Thu May 31 07:55 - 15:51 (2+07:55)
reboot   system boot  4.16.11-200.fc27 Wed May 30 02:22 - 07:55 (1+05:33)
reboot   system boot  4.16.11-200.fc27 Mon May 28 03:35 - 07:55 (3+04:20)
reboot   system boot  4.16.9-200.fc27. Thu May 24 22:29 - 03:34 (3+05:04)
reboot   system boot  4.16.7-200.fc27. Thu May 24 22:05 - 22:28  (00:23)
reboot   system boot  4.16.7-200.fc27. Tue May 22 06:04 - 22:28 (2+16:24)
reboot   system boot  4.16.7-200.fc27. Mon May 21 02:40 - 06:04 (1+03:23)
reboot   system boot  4.16.7-200.fc27. Tue May 15 05:10 - 06:04 (7+00:53)
reboot   system boot  4.13.9-300.fc27. Tue May 15 03:51 - 05:08  (01:16)

Can this issue be looked into? I would appreciate a response.

Comment 4 bharatt hareindharan 2018-06-03 19:44:30 UTC
Created attachment 1447263 [details]
system logs

I have attached system logs starting from the time the system was last suspended.

Comment 5 Justin M. Forbes 2018-07-23 15:09:16 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.

Fedora 27 has now been rebased to 4.17.7-100.fc27.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 28, and are still experiencing this issue, please change the version to Fedora 28.

If you experience different issues, please open a new bug report for those.

Comment 6 Laura Abbott 2018-10-01 21:24:11 UTC
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.
 
Fedora 27 has now been rebased to 4.18.10-100.fc27.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 28 or Fedora 29, and are still experiencing this issue, please change the version to Fedora 28 or 29.
 
If you experience different issues, please open a new bug report for those.

Comment 7 bharatt hareindharan 2018-11-05 20:37:19 UTC
 sarvesh @ vibha [ 127.0.0.1-2018-11-05-11:46:24 ] 07:07:36 $ last reboot
reboot   system boot  4.18.15-100.fc27 Mon Nov  5 11:47   still running  
                                  |
                                  |
              There are 2 "still running" entries because the system crashed on Nov 5
              due to:
              Kernel panic - not syncing: NMI: Not continuing
                                  |
              - System generated "vmcore" file when it crashed.
              - "kvm_amd" module was loaded from "Oct 24"
              - System/Motherboard BIOS FW was updated to the latest 
                "4023" version, released on "08/20/2018" for the model
                "ASUS PRIME B350M-A"
                                  |
reboot   system boot  4.18.15-100.fc27 Wed Oct 24 21:49   still running
                                  |
                                  |
                                  |
                                  |
              But during this time "kvm_amd" was not loaded.
              System was used like a very normal desktop.
              System was rebooted gracefully once after kernel
              was updated to 4.18.15-100.fc27.
                                  |
                                  |
                                  |
                                  |
reboot   system boot  4.17.19-100.fc27 Sat Sep 22 22:13 - 21:45 (31+22:32) <== Highest number of days system ran without crash.
                                                                               

reboot   system boot  4.17.19-100.fc27 Sat Sep 22 21:40 - 22:08  (00:27)
reboot   system boot  4.17.19-100.fc27 Sat Sep 22 21:38 - 21:40  (00:01)
reboot   system boot  4.17.19-100.fc27 Mon Sep 17 20:46 - 21:40 (5+00:54)
reboot   system boot  4.17.19-100.fc27 Sun Sep 16 15:15 - 21:40 (6+06:24)
reboot   system boot  4.17.19-100.fc27 Sun Sep 16 15:01 - 15:15  (00:13)
reboot   system boot  4.17.19-100.fc27 Sun Sep 16 06:11 - 14:52  (08:41)
reboot   system boot  4.17.17-100.fc27 Sun Sep 16 05:12 - 06:05  (00:52)
reboot   system boot  4.17.17-100.fc27 Sun Sep 16 05:06 - 05:11  (00:05)
reboot   system boot  4.17.17-100.fc27 Sun Sep  2 08:52 - 05:11 (13+20:19)
reboot   system boot  4.17.12-100.fc27 Sun Sep  2 06:18 - 08:51  (02:32)
reboot   system boot  4.17.12-100.fc27 Sat Sep  1 21:55 - 08:51  (10:56)
reboot   system boot  4.17.12-100.fc27 Sat Sep  1 21:50 - 21:54  (00:03)
reboot   system boot  4.17.12-100.fc27 Tue Aug 14 05:59 - 21:54 (18+15:54)
reboot   system boot  4.17.12-100.fc27 Tue Aug 14 05:33 - 05:58  (00:24)
reboot   system boot  4.17.7-100.fc27. Tue Aug 14 04:52 - 05:28  (00:35)
reboot   system boot  4.17.3-100.fc27. Thu Jul 26 07:50 - 05:28 (18+21:37)
reboot   system boot  4.17.3-100.fc27. Thu Jul 26 07:48 - 07:50  (00:01)
reboot   system boot  4.17.3-100.fc27. Fri Jul  6 03:50 - 07:50 (20+03:59)
reboot   system boot  4.16.12-200.fc27 Wed Jul  4 09:09 - 03:49 (1+18:39)
reboot   system boot  4.16.12-200.fc27 Mon Jun  4 20:19 - 03:49 (31+07:30)
reboot   system boot  4.16.12-200.fc27 Mon Jun  4 20:13 - 20:17  (00:03)
reboot   system boot  4.16.12-200.fc27 Mon Jun  4 05:19 - 19:52  (14:33)
reboot   system boot  4.16.12-200.fc27 Mon Jun  4 05:14 - 19:52  (14:38)
reboot   system boot  4.16.12-200.fc27 Mon Jun  4 04:28 - 05:13  (00:45)
reboot   system boot  4.16.12-200.fc27 Sat Jun  2 15:52 - 05:13 (1+13:21)
reboot   system boot  4.16.12-200.fc27 Sat Jun  2 15:04 - 15:51  (00:47)
reboot   system boot  4.16.12-200.fc27 Thu May 31 07:55 - 15:51 (2+07:55)
reboot   system boot  4.16.11-200.fc27 Wed May 30 02:22 - 07:55 (1+05:33)
reboot   system boot  4.16.11-200.fc27 Mon May 28 03:35 - 07:55 (3+04:20)
reboot   system boot  4.16.9-200.fc27. Thu May 24 22:29 - 03:34 (3+05:04)
reboot   system boot  4.16.7-200.fc27. Thu May 24 22:05 - 22:28  (00:23)
reboot   system boot  4.16.7-200.fc27. Tue May 22 06:04 - 22:28 (2+16:24)
reboot   system boot  4.16.7-200.fc27. Mon May 21 02:40 - 06:04 (1+03:23)
reboot   system boot  4.16.7-200.fc27. Tue May 15 05:10 - 06:04 (7+00:53)
reboot   system boot  4.13.9-300.fc27. Tue May 15 03:51 - 05:08  (01:16)

wtmp begins Tue May 15 03:51:48 2018
=================================================================

vmcore file details:

 sarvesh @ vibha [ crash ] 07:25:22 $ ll -h /var/crash/127.0.0.1-2018-11-05-11\:46\:24/vmcore
-rw-------. 1 root root 933M Nov  5 11:46 /var/crash/127.0.0.1-2018-11-05-11:46:24/vmcore

      KERNEL: /usr/lib/debug/lib/modules/4.18.15-100.fc27.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2018-11-05-11:46:24/vmcore  [PARTIAL DUMP]
        CPUS: 16
        DATE: Mon Nov  5 11:46:13 2018
      UPTIME: 8 days, 11:06:34
LOAD AVERAGE: 0.23, 0.15, 0.10
       TASKS: 1056
    NODENAME: vibha.fedora.home
     RELEASE: 4.18.15-100.fc27.x86_64
     VERSION: #1 SMP Thu Oct 18 18:19:00 UTC 2018
     MACHINE: x86_64  (2994 Mhz)
      MEMORY: 63.9 GB
       PANIC: "Kernel panic - not syncing: NMI: Not continuing"
         PID: 0
     COMMAND: "swapper/1"
        TASK: ffff94ad18170000  (1 of 16)  [THREAD_INFO: ffff94ad18170000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

There were 2 KVM guests running (not sure why 4 instances are listed), but I do not think they are anywhere close to causing this unexpected crash and reboot of the system.

crash> ps | grep 'PID\|qemu'
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
  17318      1   0  ffff94ad11509f00  IN   0.7 4134332 442348  qemu-system-x86
  17324      1   8  ffff94acdc17dd00  IN   0.7 4134332 442348  qemu-system-x86
  17345      1  14  ffff94acb0c7dd00  IN   0.6 4003256 414036  qemu-system-x86
  17350      1  14  ffff94ac7cd33e00  IN   0.6 4003256 414036  qemu-system-x86

Comment 8 bharatt hareindharan 2018-11-05 20:40:12 UTC
crash> bt
PID: 0      TASK: ffff94ad18170000  CPU: 1   COMMAND: "swapper/1"
 #0 [fffffe0000034d10] machine_kexec at ffffffff9805bc2e
 #1 [fffffe0000034d68] __crash_kexec at ffffffff9814e861
 #2 [fffffe0000034e28] panic at ffffffff980aece2
 #3 [fffffe0000034eb0] nmi_panic at ffffffff980ae8b5
 #4 [fffffe0000034eb8] unknown_nmi_error at ffffffff9802a9af
 #5 [fffffe0000034ed0] do_nmi at ffffffff9802ac52
 #6 [fffffe0000034ef0] end_repeat_nmi at ffffffff98a014d8
    [exception RIP: acpi_idle_do_entry+21]
    RIP: ffffffff989123e5  RSP: ffffa6d48636be48  RFLAGS: 00000093
    RAX: 0000000000000000  RBX: ffff94ad17b4a000  RCX: 0000000000000068
    RDX: 0000000000000414  RSI: ffffffff992dac60  RDI: ffff94ad17b4a098
    RBP: ffff94ad17b4a098   R8: 00000000000e691d   R9: 00000000ffffffff
    R10: ffffa6d48636be78  R11: 00000000000cf6cf  R12: 0000000000000002
    R13: 0000000000000002  R14: ffffffff992dac60  R15: 0002990571309b5e
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #7 [ffffa6d48636be48] acpi_idle_do_entry at ffffffff989123e5
 #8 [ffffa6d48636be48] acpi_idle_enter at ffffffff98534217
 #9 [ffffa6d48636be90] cpuidle_enter_state at ffffffff9875caee
#10 [ffffa6d48636bed0] do_idle at ffffffff980e2654
#11 [ffffa6d48636bf10] cpu_startup_entry at ffffffff980e289f
#12 [ffffa6d48636bf30] start_secondary at ffffffff98050f57
#13 [ffffa6d48636bf50] secondary_startup_64 at ffffffff980000d5
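
In the backtrace above, the frames before the "<NMI exception stack>" marker are the NMI and panic path, and the frames after it are what CPU 1 was doing when the NMI arrived (idling in `acpi_idle_do_entry`). A rough Python sketch for splitting saved `bt` text into those two groups is below. The regular expression is written against the frame layout shown here, and the names `FRAME_RE` and `split_bt` are made up for illustration.

```python
import re

# Matches frame lines such as: " #3 [fffffe0000034eb0] nmi_panic at ffffffff980ae8b5"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")

def split_bt(bt_text):
    """Split crash 'bt' output into (nmi_path, interrupted_path),
    each a list of function names in the order printed."""
    nmi_path, interrupted_path = [], []
    current = nmi_path
    for line in bt_text.splitlines():
        if "<NMI exception stack>" in line:
            # Frames after this marker belong to the interrupted context.
            current = interrupted_path
            continue
        match = FRAME_RE.match(line)
        if match:
            current.append(match.group(2))
    return nmi_path, interrupted_path
```

Register dump lines and the "[exception RIP: ...]" line do not match the frame pattern, so they are skipped.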

Comment 9 bharatt hareindharan 2018-11-05 20:41:25 UTC
 sarvesh @ vibha [ 127.0.0.1-2018-11-05-11:46:24 ] 07:11:47 $ cat /etc/sysctl.d/10-panic.conf 
kernel.sysrq = 1
vm.panic_on_oom = 1
kernel.softlockup_panic = 1
kernel.panic_on_unrecovered_nmi = 1
kernel.unknown_nmi_panic = 1
kernel.panic = 1
kernel.panic_on_io_nmi = 1
kernel.panic_on_oops = 1
kernel.panic_on_stackoverflow = 1
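
With `kernel.unknown_nmi_panic = 1` set as above, an NMI the kernel cannot attribute leads to a panic, which is consistent with the `unknown_nmi_error` and `nmi_panic` frames in the backtrace in Comment 8. A short Python sketch for parsing such a file and mapping each key to its `/proc/sys` path, so the live value can be compared with the configured one, is below. The helper names `parse_sysctl_conf` and `proc_path` are hypothetical, and the parser only handles simple `key = value` lines like the ones shown.

```python
def parse_sysctl_conf(text):
    """Parse simple 'key = value' sysctl lines into a dict of strings.
    Blank lines and lines starting with '#' or ';' are ignored."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip()
    return settings

def proc_path(key):
    """Map a dotted sysctl key to its path under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")
```

Reading the file at `proc_path(key)` on the affected machine and comparing it with the parsed value would show whether each setting is actually in effect; settings the running kernel does not provide simply have no file there.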

Comment 10 Ben Cotton 2018-11-27 13:38:53 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30  Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora  'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 27 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Ben Cotton 2018-11-30 23:40:12 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

