On Intel S5520SC workstation:

00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
00:10.0 PIC: Intel Corporation 7500/5520/5500/X58 Physical and Link Layer Registers Port 0 (rev 22)
00:10.1 PIC: Intel Corporation 7500/5520/5500/X58 Routing and Protocol Layer Registers Port 0 (rev 22)
00:11.0 PIC: Intel Corporation 7500/5520/5500 Physical and Link Layer Registers Port 1 (rev 22)
00:11.1 PIC: Intel Corporation 7500/5520/5500 Routing & Protocol Layer Register Port 1 (rev 22)
00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22)
00:14.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub System Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Throttle Registers (rev 22)
00:15.0 PIC: Intel Corporation 7500/5520/5500/X58 Trusted Execution Technology Registers (rev 22)
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 5
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 6
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2
01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
01:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
07:00.0 PCI bridge: PLX Technology, Inc. PEX8112 x1 Lane PCI Express-to-PCI Bridge (rev aa)
07:01.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
08:00.0 VGA compatible controller: NVIDIA Corporation G98 [GeForce 8400 GS Rev. 2] (rev a1)
fe:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers (rev 02)
fe:00.1 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture System Address Decoder (rev 02)
fe:02.0 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 0 (rev 02)
fe:02.1 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 0 (rev 02)
fe:02.2 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 0 (rev 02)
fe:02.3 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 1 (rev 02)
fe:02.4 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 1 (rev 02)
fe:02.5 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 1 (rev 02)
fe:03.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Registers (rev 02)
fe:03.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Target Address Decoder (rev 02)
fe:03.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller RAS Registers (rev 02)
fe:03.4 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Test Registers (rev 02)
fe:04.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Control (rev 02)
fe:04.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Address (rev 02)
fe:04.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Rank (rev 02)
fe:04.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Thermal Control (rev 02)
fe:05.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Control (rev 02)
fe:05.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Address (rev 02)
fe:05.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Rank (rev 02)
fe:05.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Thermal Control (rev 02)
fe:06.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Control (rev 02)
fe:06.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Address (rev 02)
fe:06.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Rank (rev 02)
fe:06.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Thermal Control (rev 02)
ff:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers (rev 02)
ff:00.1 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture System Address Decoder (rev 02)
ff:02.0 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 0 (rev 02)
ff:02.1 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 0 (rev 02)
ff:02.2 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 0 (rev 02)
ff:02.3 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 1 (rev 02)
ff:02.4 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 1 (rev 02)
ff:02.5 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 1 (rev 02)
ff:03.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Registers (rev 02)
ff:03.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Target Address Decoder (rev 02)
ff:03.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller RAS Registers (rev 02)
ff:03.4 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Test Registers (rev 02)
ff:04.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Control (rev 02)
ff:04.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Address (rev 02)
ff:04.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Rank (rev 02)
ff:04.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Thermal Control (rev 02)
ff:05.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Control (rev 02)
ff:05.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Address (rev 02)
ff:05.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Rank (rev 02)
ff:05.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Thermal Control (rev 02)
ff:06.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Control (rev 02)
ff:06.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Address (rev 02)
ff:06.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Rank (rev 02)
ff:06.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Thermal Control (rev 02)

kernel-3.12.6-200.fc19.x86_64 keeps rebooting itself:

reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 08:04 - 08:07 (00:03)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 08:01 - 08:07 (00:06)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:58 - 08:07 (00:09)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:56 - 08:07 (00:11)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:53 - 08:07 (00:14)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:50 - 08:07 (00:17)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:47 - 08:07 (00:19)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:45 - 08:07 (00:22)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:42 - 08:07 (00:25)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:39 - 08:07 (00:28)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:36 - 08:07 (00:31)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:31 - 08:07 (00:36)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:28 - 08:07 (00:39)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:25 - 08:07 (00:42)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:22 - 08:07 (00:45)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:19 - 08:07 (00:48)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:17 - 08:07 (00:50)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:14 - 08:07 (00:53)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:11 - 08:07 (00:56)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:08 - 08:07 (00:59)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:05 - 08:07 (01:02)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:03 - 08:07 (01:04)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 07:00 - 08:07 (01:07)
reboot system boot 3.12.6-200.0.fc1 Tue Dec 24 06:57 - 08:07 (01:10)

kernel-3.12.5-201.fc19.x86_64 is OK.
Odd. Are there any clues in /var/log/messages? Does the machine oops and then get rebooted?

There are not all that many patches between the two kernels:
git lg v3.12.5..v3.12.6 |wc -l
119

Any chance you could bisect this?
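For a rough idea of what a bisect would cost here: with 119 commits in the range, a binary search needs about seven build-and-boot cycles. The sketch below only simulates that search in Python. The `is_bad` callback is a hypothetical stand-in for "boot this kernel and wait to see whether it reboots by itself", and the code is not a description of how `git bisect` works internally.

```python
def bisect_steps(n_commits, is_bad):
    """Return (index of first bad commit, number of kernels tested).

    Commits are numbered 0..n_commits-1, oldest first. The parent of
    commit 0 (v3.12.5) is known good and the last commit (v3.12.6) is
    known bad, so the first bad commit is somewhere in the range.
    """
    lo, hi = 0, n_commits - 1      # first bad commit lies in [lo, hi]
    tested = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tested += 1                # one kernel build plus one test boot
        if is_bad(mid):
            hi = mid               # culprit is mid or something earlier
        else:
            lo = mid + 1           # culprit is after mid
    return lo, tested


N_COMMITS = 119  # the `wc -l` count quoted above

# Try every possible culprit position and record the most expensive search.
worst_case = max(
    bisect_steps(N_COMMITS, lambda commit, fb=first_bad: commit >= fb)[1]
    for first_bad in range(N_COMMITS)
)
print("worst-case kernels to test:", worst_case)
```

Each of those test boots has to run long enough to be sure the kernel is good, so the waiting time per step matters more than the number of steps.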
Created attachment 841682
/var/log/messages

Here is /var/log/messages from the rebooting period.
(In reply to Michele Baldessari from comment #1)
> There are not all too many patches between the two kernels:
> git lg v3.12.5..v3.12.6 |wc -l
> 119
>
> Any chance you could bisect this?

I will see what I can do after New Year.
I've taken a short look. Interestingly, after the first start with the 3.12.6 kernel the system lasted a couple of hours. Afterwards we only get 2-3 minutes:
Dec 23 12:22:44 version 3.12.6-200.0.fc19.x86_64 <-- First boot
Dec 23 16:36:47 version 3.12.6-200.0.fc19.x86_64 <-- >2 hours
Dec 23 16:39:13 version 3.12.6-200.0.fc19.x86_64 <-- 2-3 minutes from now on
Dec 23 16:41:54 version 3.12.6-200.0.fc19.x86_64
Dec 23 16:44:42 version 3.12.6-200.0.fc19.x86_64
Dec 23 16:47:31 version 3.12.6-200.0.fc19.x86_64

The messages before the reboot are all similar. Chances are that one of:
- systemd[1]: Started Load Kernel Modules.
- systemd[1]: Starting Apply Kernel Variables...
has something to do with this.

The ioatdma warning is present with 3.12.5 as well, so it can probably be ignored.

Either we can try to bisect this or we can collect a crashdump (https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes) in order to see what is going on here.

I took another look at the commits between 3.12.5 and 3.12.6 but nothing stood out.
Let me know if you need a hand with crash or with bisecting (crash is probably quicker to at least isolate the issue).

regards,
Michele
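The uptimes implied by the boot timestamps in this comment can be computed rather than eyeballed. A minimal Python sketch follows; syslog timestamps carry no year, so the `year=2013` default is an assumption, and `%b` parsing assumes an English/C locale.

```python
from datetime import datetime

# Boot timestamps as quoted in the comment above.
BOOTS = [
    "Dec 23 12:22:44",
    "Dec 23 16:36:47",
    "Dec 23 16:39:13",
    "Dec 23 16:41:54",
    "Dec 23 16:44:42",
    "Dec 23 16:47:31",
]


def uptimes(stamps, year=2013):
    """Return the time between each boot and the next one.

    The year is not in the log lines; 2013 is assumed so that
    strptime gets a complete date.
    """
    times = [datetime.strptime(f"{year} {s}", "%Y %b %d %H:%M:%S") for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


for stamp, delta in zip(BOOTS, uptimes(BOOTS)):
    print(f"booted {stamp}, next boot {delta} later")
```

The first interval comes out at a bit over four hours, and every later one is under three minutes, which matches the annotations on the log lines.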
(In reply to Michele Baldessari from comment #4)
> I've taken a short look. Interestingly after the first start with 3.12.6
> kernel
> the system lasted a couple of hours. Afterwards we only get 2/3 minutes:
> Dec 23 12:22:44 version 3.12.6-200.0.fc19.x86_64 <-- First boot
> Dec 23 16:36:47 version 3.12.6-200.0.fc19.x86_64 <-- >2 hours

The machine was under heavy load just before the first reboot.

> Dec 23 16:39:13 version 3.12.6-200.0.fc19.x86_64 <-- 2/3 minutes from now on
> Dec 23 16:41:54 version 3.12.6-200.0.fc19.x86_64
> Dec 23 16:44:42 version 3.12.6-200.0.fc19.x86_64
> Dec 23 16:47:31 version 3.12.6-200.0.fc19.x86_64
>
> The messages before the reboot are all similar. Chances are that either:
> - systemd[1]: Started Load Kernel Modules.
> - systemd[1]: Starting Apply Kernel Variables...

Can I find out what exactly they are doing?

> Have something to do with this.
>
> The ioatdma warning is present with 3.12.5 as well so can probably be
> ignored.

Yes, it can be ignored.

> Either we can try to bisect this or we can collect a crashdump
> (https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes) in
> order to see what is going on here.

I am not sure if kdump will help here since there was no kernel
message when it happened.

> I took another look at the commits between 3.12.5 and 3.12.6 but nothing
> stood out.
> Let me know if you need a hand with crash or via bisecting (crash is
> probably quicker to at least isolate
> the issue)

I will try to bisect after New Year. But it may take a while to trigger
the first reboot.
(In reply to H.J. Lu from comment #5)
> (In reply to Michele Baldessari from comment #4)
> > I've taken a short look. Interestingly after the first start with 3.12.6
> > kernel
> > the system lasted a couple of hours. Afterwards we only get 2/3 minutes:
> > Dec 23 12:22:44 version 3.12.6-200.0.fc19.x86_64 <-- First boot
> > Dec 23 16:36:47 version 3.12.6-200.0.fc19.x86_64 <-- >2 hours
>
> Machine was under heavy load just before the first reboot.

Ah ok. Could be related, yes.

> > Dec 23 16:39:13 version 3.12.6-200.0.fc19.x86_64 <-- 2/3 minutes from now on
> > Dec 23 16:41:54 version 3.12.6-200.0.fc19.x86_64
> > Dec 23 16:44:42 version 3.12.6-200.0.fc19.x86_64
> > Dec 23 16:47:31 version 3.12.6-200.0.fc19.x86_64
> >
> > The messages before the reboot are all similar. Chances are that either:
> > - systemd[1]: Started Load Kernel Modules.
> > - systemd[1]: Starting Apply Kernel Variables...
>
> Can I find out what exactly they are doing?

Now that I looked more closely, the first one is for loading modules statically (man modules-load.d and man systemd-modules-load). And at least here it is unconfigured.

The second one parses /etc/sysctl.d (man sysctl.d).

> > Have something to do with this.
> >
> > The ioatdma warning is present with 3.12.5 as well so can probably be
> > ignored.
>
> Yes, it can be ignored.
>
> > Either we can try to bisect this or we can collect a crashdump
> > (https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes) in
> > order to see what is going on here.
>
> I am not sure if kdump will help here since there was no kernel
> message when it happened.

Oh, so also on screen there was no visual feedback whatsoever? Unless for some reason it is not shown, then I agree a crash dump will do little.

> > I took another look at the commits between 3.12.5 and 3.12.6 but nothing
> > stood out.
> > Let me know if you need a hand with crash or via bisecting (crash is
> > probably quicker to at least isolate
> > the issue)
>
> I will try bisect after New Year. But it may take a while to trigger
> the first reboot.

Ok, thanks for your help here.
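To see what those two units would actually act on, the configuration directories can be listed. This is a minimal Python sketch, assuming the search paths given in modules-load.d(5) and sysctl.d(5); the helper name `active_settings` is made up for illustration, and the sketch does not implement the precedence rules between directories.

```python
from pathlib import Path

# Search paths as documented in modules-load.d(5) and sysctl.d(5).
MODULES_LOAD_DIRS = ["/etc/modules-load.d", "/run/modules-load.d", "/usr/lib/modules-load.d"]
SYSCTL_DIRS = ["/etc/sysctl.d", "/run/sysctl.d", "/usr/lib/sysctl.d"]


def active_settings(dirs):
    """Map each readable *.conf file under dirs to its effective lines.

    Blank lines and comment lines (starting with '#' or ';') are dropped.
    Missing directories and unreadable files are skipped silently.
    """
    found = {}
    for d in dirs:
        directory = Path(d)
        if not directory.is_dir():
            continue
        for conf in sorted(directory.glob("*.conf")):
            try:
                lines = conf.read_text().splitlines()
            except OSError:
                continue
            stripped = [ln.strip() for ln in lines]
            found[str(conf)] = [ln for ln in stripped if ln and not ln.startswith(("#", ";"))]
    return found


for label, dirs in (("modules-load.d", MODULES_LOAD_DIRS), ("sysctl.d", SYSCTL_DIRS)):
    print(label)
    for path, settings in active_settings(dirs).items():
        print(" ", path, settings)
```

An empty modules-load.d section in the output would correspond to the "unconfigured" case described above.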
(In reply to Michele Baldessari from comment #6)
> > >
> > > The messages before the reboot are all similar. Chances are that either:
> > > - systemd[1]: Started Load Kernel Modules.
> > > - systemd[1]: Starting Apply Kernel Variables...
> >
> > Can I find out what exactly they are doing?
>
> Now that I looked more closely the first one is for loading modules
> statically (man modules-load.d and man systemd-modules-load). And at least
> here it is unconfigured.
>
> The second one parses /etc/sysctl.d (man sysctl.d)

Why does the OS do that after running for more than 2 hours?

>
> Oh so also on screen there was no visual feedback whatsoever?

I couldn't tell since the machine was rebooted :-(.
(In reply to H.J. Lu from comment #7)
> (In reply to Michele Baldessari from comment #6)
> > > >
> > > > The messages before the reboot are all similar. Chances are that either:
> > > > - systemd[1]: Started Load Kernel Modules.
> > > > - systemd[1]: Starting Apply Kernel Variables...
> > >
> > > Can I find out what exactly they are doing?
> >
> > Now that I looked more closely the first one is for loading modules
> > statically (man modules-load.d and man systemd-modules-load). And at least
> > here it is unconfigured.
> >
> > The second one parses /etc/sysctl.d (man sysctl.d)
>
> Why does OS do that after running for more than 2 hours?

Not sure I saw that it was run after two hours. Do you have a specific timestamp in mind? There are some slight oddities in the timings but I think that is dmesg vs syslog vs journal.

> >
> > Oh so also on screen there was no visual feedback whatsoever?
>
> I couldn't tell since machine was rebooted :-(.

Ah ok. If you are in front of the machine and see it reboot and it is an oops, feel free to take a picture and upload it here.
I will try to get more info after New Year.
Just adding the needinfo flag as a reminder
Several machines share one monitor and USB keyboard. While this machine was connected to the monitor and USB keyboard, kernel 3.12.6 ran fine. After I unplugged the monitor and USB keyboard to use them on another machine, the now headless machine rebooted itself almost immediately, without any messages:

Jan 10 14:52:07 gnu-mic-2 systemd-logind[662]: Removed session 19.
Jan 10 14:55:59 gnu-mic-2 kernel: [111751.236673] usb 6-1: USB disconnect, device number 2
Jan 10 14:55:59 gnu-mic-2 kernel: [111751.236679] usb 6-1.1: USB disconnect, device number 3
Jan 10 14:55:59 gnu-mic-2 acpid: input device has been disconnected, fd 10
Jan 10 14:55:59 gnu-mic-2 acpid: input device has been disconnected, fd 11
Jan 10 14:58:57 gnu-mic-2 rsyslogd: [origin software="rsyslogd" swVersion="7.2.6" x-pid="649" x-info="http://www.rsyslog.com"] start
Jan 10 14:58:57 gnu-mic-2 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Jan 10 14:58:57 gnu-mic-2 kernel: [ 0.000000] Initializing cgroup subsys cpu
Jan 10 14:58:57 gnu-mic-2 kernel: [ 0.000000] Initializing cgroup subsys cpuacct

It kept rebooting until I plugged the monitor and USB keyboard back in. Will 3.11/3.12 kernels reboot without a monitor and USB keyboard in some cases?
It seems to be a nouveau driver bug which is fixed in kernel 3.12.9-200.fc19.x86_64. Now, instead of a reboot, I get

[ 8410.941640] usb 6-1: USB disconnect, device number 2
[ 8410.941647] usb 6-1.1: USB disconnect, device number 3
[ 8426.407362] pci_pm_runtime_suspend(): nouveau_pmops_runtime_suspend+0x0/0xd0 [nouveau] returns -22

when I unplug the USB keyboard. Can someone confirm that it is a known bug fixed in 3.12.9?
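For readers decoding the "returns -22" in that dmesg line: kernel functions conventionally return negated errno values, and 22 is EINVAL. A small Python sketch of the lookup follows, assuming it runs on a Linux host so that Python's errno table matches the kernel's numbering; `describe_kernel_errno` is a name made up for this example. It only names the error and says nothing about why the nouveau runtime suspend failed.

```python
import errno
import os


def describe_kernel_errno(ret):
    """Translate a negative kernel return value such as -22.

    Returns (symbolic name, human-readable text). Assumes the host's
    errno numbering matches the kernel that produced the message.
    """
    code = -ret
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


name, text = describe_kernel_errno(-22)
print(f"-22 is -{name}: {text}")
```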
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs. Fedora 19 has now been rebased to 3.13.5-100.fc19. Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.