Bug 522201

Summary: kernel-2.6.30.5-43.fc11.x86_64 unbootable on ThinkPad T500
Product: [Fedora] Fedora
Reporter: Stephen John Smoogen <smooge>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Docs Contact:
Priority: high
Version: 11
CC: allisson, amlau, bobpoljakov, emcnabb, itamar, jason, jrickman, kernel-maint, me, smooge
Hardware: x86_64   
OS: Linux   
Last Closed: 2010-04-28 22:39:25 UTC
Attachments:
Description                 Flags
lspci                       none
lspci -vvv                  none
lspci -vvv run as root      none
Data from DMAR table        none

Description Stephen John Smoogen 2009-09-09 18:32:28 UTC
Created attachment 360308 [details]
lspci 

Description of problem:

Upgraded to the new kernel and rebooted the system. The system crashes before the initial RAM disk image is started. The only text on the screen is a set of DRM or similar messages.

 http://lkml.indiana.edu/hypermail/linux/kernel/0906.3/00737.html

may be similar.

Version-Release number of selected component (if applicable):
kernel-2.6.30.5-43.fc11.x86_64

How reproducible:
100%


Additional info:

Including lspci as attachments.

Comment 1 Stephen John Smoogen 2009-09-09 18:33:08 UTC
Created attachment 360310 [details]
lspci -vvv

Here is the -vvv as that is usually more helpful.

Comment 2 Stephen John Smoogen 2009-09-09 18:34:16 UTC
Created attachment 360311 [details]
lspci -vvv run as root

Comment 3 Jason Merrill 2009-09-11 22:11:46 UTC
On my ThinkPad T61, booting with kernel-PAE-2.6.30.5-43.fc11.i686 hangs after the Fedora icon finishes filling. After rebooting back to 2.6.29, I needed to run fsck manually to repair some filesystem errors.

Comment 4 Stephen John Smoogen 2009-09-11 23:09:11 UTC
Looked at the GRUB options. I currently have intel_iommu=on set for KVM systems. Removing this option allowed the system to complete booting, but it then locked up in X as the Fedora icon finished loading. With the option in place, the system stops before we get to the Fedora icon.
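
For anyone comparing setups, here is a minimal Python sketch for checking which IOMMU-related options the running kernel was booted with. It assumes a plain whitespace-separated command line as found in /proc/cmdline; the function name iommu_options is only illustrative and not part of any existing tool.

```python
from pathlib import Path


def iommu_options(cmdline: str) -> dict:
    """Return the intel_iommu= / iommu= options found on a kernel command line.

    Assumption: options are separated by whitespace and none of the
    relevant values contain quoted spaces.
    """
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key in ("intel_iommu", "iommu"):
            # A bare flag without "=value" is recorded as an empty string.
            found[key] = value if sep else ""
    return found


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        print(iommu_options(path.read_text()))
```

An empty result means neither option was passed, which is the configuration that got past the early crash here.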

Comment 5 Chris Fleming 2009-09-14 15:58:05 UTC
Different laptop (HP 6930p) and a slightly different kernel (2.6.30.5-43.fc11.i686.PAE), but this also hangs during boot.

Comment 6 Jason Merrill 2009-09-15 14:41:43 UTC
Nothing helpful-looking in /var/log/messages from before the hang; the last few lines are

Sep 11 17:31:58 gemini NetworkManager: <info>  (eth0): device state change: 7 -> 8 (reason 0)
Sep 11 17:31:58 gemini NetworkManager: <info>  Policy set 'System eth0' (eth0) as default for routing and DNS.
Sep 11 17:31:58 gemini NetworkManager: <info>  Activation (eth0) successful, device activated.
Sep 11 17:31:58 gemini NetworkManager: <info>  Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.
Sep 11 17:31:58 gemini ntpd[1683]: Listening on interface #4 eth0, fe80::21a:6bff:fed2:c0e8#123 Enabled
Sep 11 17:31:58 gemini ntpd[1683]: Listening on interface #5 eth0, 192.168.1.65#123 Enabled
Sep 11 17:32:04 gemini auditd[1896]: Started dispatcher: /sbin/audispd pid: 1898
Sep 11 17:32:04 gemini auditd[1896]: Init complete, auditd 1.7.13 listening for events (startup state enable)
Sep 11 17:32:04 gemini audispd: af_unix plugin initialized
Sep 11 17:32:04 gemini audispd: audispd initialized with q_depth=80 and 2 active plugins

Comment 7 Stephen John Smoogen 2009-09-29 21:33:39 UTC
The problem still occurs in kernel-2.6.30.8; however, I have found that the trigger for an immediate crash is the intel_iommu=on flag used for KVM. With this turned off, the system runs for a bit (I just installed .8, so I am not sure how long yet; .5 ran until X froze on it later that day). Watching the screen and trying to freeze the boot at the point where it crashes led me to these messages on the console. If intel_iommu is on, we crash as we change the frame buffer. Without it we get the following in dmesg.

----------
[drm] Initialized drm 1.1.0 20060810
i915 0000:00:02.0: power state changed by ACPI to D0
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
i915 0000:00:02.0: setting latency timer to 64
  alloc irq_desc for 29 on cpu 0 node 0
  alloc kstat_irqs on cpu 0 node 0
i915 0000:00:02.0: irq 29 for MSI/MSI-X
allocated 1680x1050 fb: 0x02020000, bo ffff88007a470a80
Console: switching to colour frame buffer device 210x65
[drm] LVDS-8: set mode 1680x1050 11
fb0: inteldrmfb frame buffer device
registered panic notifier
acpi device:03: registered as cooling_device2
input: Video Bus as /devices/LNXSYSTM:00/device:00/PNP0A08:00/device:02/input/input8
ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

-------------

My guess is that the IOMMU code and the i915 driver are not interacting as well as they should.

Comment 8 Jason Merrill 2009-11-02 15:32:00 UTC
kernel-PAE-2.6.30.9-90.fc11.i686 boots for me; this is the first 2.6.30 kernel I've tried since the earlier one failed.

Comment 9 Stephen John Smoogen 2009-11-16 19:22:10 UTC
Updated to Fedora 12 after a hard-drive failure. The system still had problems, and dmesg was filled with errors such as:

DRHD: handling fault status reg 3
DMAR:[DMA Read] Request device [00:1b.0] fault addr 0
DMAR:[fault reason 06] PTE Read access is not set
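
To see which devices are faulting and how often, the messages above can be tallied with a short script. This is only a sketch: the regular expressions are written against the exact message format shown in this comment, which may differ in other kernel versions, and the name summarize_dmar_faults is made up for illustration.

```python
import re
from collections import Counter

# Patterns match the two DMAR message lines quoted in this report.
REQUEST_RE = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device "
    r"\[([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] fault addr ([0-9a-f]+)"
)
REASON_RE = re.compile(r"DMAR:\[fault reason (\d+)\] (.+)")


def summarize_dmar_faults(lines):
    """Count faults per (device, access type, reason code, reason text)."""
    counts = Counter()
    pending = None  # request line waiting for its matching reason line
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            pending = (match.group(2), match.group(1))
            continue
        match = REASON_RE.search(line)
        if match and pending is not None:
            device, access = pending
            key = (device, access, int(match.group(1)), match.group(2).strip())
            counts[key] += 1
            pending = None
    return counts


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python this_script.py
    for key, count in summarize_dmar_faults(sys.stdin).most_common():
        print(count, key)
```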

I also noticed the new hard drive was making clicking noises. Changed /etc/grub.conf to add iommu=soft:

kernel /vmlinuz-2.6.31.5-127.fc12.x86_64 ro root=/dev/mapper/vg_bakeneko-lv_root  LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us rhgb quiet iommu=soft

After rebooting, both the hard-drive clicking and the DRHD errors went away. After looking through some other logs, I am going to attach the ACPI DMAR table in case there is something there people can use.

sudo cat /sys/firmware/acpi/tables/DMAR > /tmp/DMAR
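
For a quick sanity check of a dump like this before attaching it, the sketch below reads the standard 36-byte ACPI table header and verifies the checksum. It does not decode the DMAR-specific remapping structures that follow the header. The function name parse_acpi_header is illustrative, and /tmp/DMAR is the path used in the command above.

```python
import struct
from pathlib import Path

# Standard ACPI system description table header (36 bytes, little-endian):
# signature, length, revision, checksum, OEM ID, OEM table ID,
# OEM revision, creator ID, creator revision.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


def parse_acpi_header(data: bytes) -> dict:
    """Decode the common ACPI table header and check the table checksum."""
    if len(data) < ACPI_HEADER.size:
        raise ValueError("file is shorter than an ACPI table header")
    (signature, length, revision, _checksum,
     oem_id, oem_table_id, oem_revision,
     _creator_id, _creator_revision) = ACPI_HEADER.unpack_from(data)
    return {
        "signature": signature.decode("ascii", "replace"),
        "length": length,
        "revision": revision,
        "oem_id": oem_id.decode("ascii", "replace").rstrip("\x00 "),
        "oem_table_id": oem_table_id.decode("ascii", "replace").rstrip("\x00 "),
        "oem_revision": oem_revision,
        # All bytes of a valid table sum to zero modulo 256.
        "checksum_ok": len(data) >= length and sum(data[:length]) % 256 == 0,
    }


if __name__ == "__main__":
    dump = Path("/tmp/DMAR")
    if dump.exists():
        print(parse_acpi_header(dump.read_bytes()))
```

A signature other than DMAR or a failed checksum would suggest the dump was truncated or taken from the wrong file.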

Comment 10 Stephen John Smoogen 2009-11-16 19:23:16 UTC
Created attachment 369772 [details]
Data from DMAR table

Taken from a ThinkPad T500 with 2 GB of RAM.

Comment 11 Bug Zapper 2010-04-28 10:15:34 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 12 Stephen John Smoogen 2010-04-28 22:39:25 UTC
This problem went away with the 2.6.31/32 kernels and is not a problem with F12.