Bug 513528 - Rawhide kernel fails to boot on Dell Studio XPS
Keywords:
Status: CLOSED DUPLICATE of bug 521322
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kyle McMartin
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-07-24 01:41 UTC by Rodd Clarkson
Modified: 2015-09-01 03:53 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-10 10:54:35 UTC
Type: ---
Embargoed:


Attachments
Trace that initially appears. I can't scroll so I can't get any more than this. (1.37 MB, image/jpeg)
2009-07-24 01:41 UTC, Rodd Clarkson
Backtrace that appears after booting with added pci=nomsi (1.20 MB, image/jpeg)
2009-07-24 01:57 UTC, Rodd Clarkson
More backtrace that appears after a while (1.27 MB, image/jpeg)
2009-07-24 02:00 UTC, Rodd Clarkson
Single image with crash output pasted from numerous photos (398.47 KB, image/jpeg)
2009-07-28 04:36 UTC, Rodd Clarkson
make smscore use a struct device with dma ops. (66 bytes, text/plain)
2009-08-19 17:46 UTC, Kyle McMartin
make smscore use a struct device with dma ops. (1020 bytes, patch)
2009-08-19 17:48 UTC, Kyle McMartin
fix dma-debug functionality for null ptr deref in unmap path (1.45 KB, patch)
2009-08-20 01:15 UTC, Kyle McMartin
dmesg output starting system at runlevel 3 (54.32 KB, text/plain)
2009-09-02 00:17 UTC, Rodd Clarkson
Xorg.0.log output after running startx from runlevel 3 (17.42 KB, text/plain)
2009-09-02 00:19 UTC, Rodd Clarkson

Description Rodd Clarkson 2009-07-24 01:41:16 UTC
Created attachment 354948 [details]
Trace that initially appears.  I can't scroll so I can't get any more than this.

Description of problem:

While booting, the kernel panics with some output.

I think this has something to do with my internal DVB tuner card.

I'll post pictures with the backtrace that appears on the display.



Version-Release number of selected component (if applicable):

This is based on the rawhide iso from this page:
 
http://www.fedoraproject.org/wiki/Test_Day:2009-07-21_Fit_and_Finish:Batteries_and_Suspend
  http://mclasen.fedorapeople.org/livecd-fedora-livecd-desktop-200907210106.iso



Additional info:

[rodd@moose ~]$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub [8086:2a40] (rev 07)
00:01.0 PCI bridge [0604]: Intel Corporation Mobile 4 Series Chipset PCI Express Graphics Port [8086:2a41] (rev 07)
00:1a.0 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 [8086:2937] (rev 03)
00:1a.1 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 [8086:2938] (rev 03)
00:1a.2 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 [8086:2939] (rev 03)
00:1a.7 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 [8086:293c] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801I (ICH9 Family) HD Audio Controller [8086:293e] (rev 03)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 [8086:2940] (rev 03)
00:1c.1 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 [8086:2942] (rev 03)
00:1c.3 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 4 [8086:2946] (rev 03)
00:1c.5 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 [8086:294a] (rev 03)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 [8086:2934] (rev 03)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 [8086:2935] (rev 03)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 [8086:2936] (rev 03)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 [8086:293a] (rev 03)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev 93)
00:1f.0 ISA bridge [0601]: Intel Corporation ICH9M LPC Interface Controller [8086:2919] (rev 03)
00:1f.2 SATA controller [0106]: Intel Corporation ICH9M/M-E SATA AHCI Controller [8086:2929] (rev 03)
00:1f.3 SMBus [0c05]: Intel Corporation 82801I (ICH9 Family) SMBus Controller [8086:2930] (rev 03)
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Radeon Mobility HD 3670 [1002:9593]
01:00.1 Audio device [0403]: ATI Technologies Inc RV635 Audio device [Radeon HD 3600 Series] [1002:aa20]
04:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 5300 AGN [Shiloh] Network Connection [8086:4235]
08:00.0 Ethernet controller [0200]: Broadcom Corporation NetLink BCM5784M Gigabit Ethernet PCIe [14e4:1698] (rev 10)
09:01.0 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 IEEE 1394 Controller [1180:0832] (rev 05)
09:01.1 SD Host controller [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter [1180:0822] (rev 22)
09:01.2 System peripheral [0880]: Ricoh Co Ltd R5C843 MMC Host Controller [1180:0843] (rev 12)
09:01.3 System peripheral [0880]: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter [1180:0592] (rev 12)
09:01.4 System peripheral [0880]: Ricoh Co Ltd xD-Picture Card Controller [1180:0852] (rev ff)

Comment 1 Rodd Clarkson 2009-07-24 01:57:08 UTC
Created attachment 354952 [details]
Backtrace that appears after booting with added pci=nomsi

I already have to add pci=nomsi to the kernel boot parameters in F11 to work around some suspend/resume issues, so I thought I'd try it here as well.

Comment 2 Rodd Clarkson 2009-07-24 02:00:19 UTC
Created attachment 354953 [details]
More backtrace that appears after a while

While I was taking the photos, more backtrace appeared.  I've seen this twice, and while this one appeared after using pci=nomsi, similar output also appears without the extra parameter.

Comment 3 Rodd Clarkson 2009-07-28 04:36:20 UTC
Created attachment 355354 [details]
Single image with crash output pasted from numerous photos

Here's all the crash output in a single image.

I booted with radeon.modeset=0 to get this (it seemed to improve the boot), but it still failed.

Comment 4 Rodd Clarkson 2009-08-07 23:15:28 UTC
I'm seeing the same problem with the recently supplied F-12-Alpha build.

Can someone hold my hand so I can supply some useful output for this?

I've checked /var/log/messages on the installed system (the one that won't boot), but it contains nothing.

Comment 5 Kyle McMartin 2009-08-19 04:38:10 UTC
Looks like a null pointer dereference of the kobj inside struct device passed in via dma_free_coherent in smscore_unregister_device.

Try booting with dma_debug=off.
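
After rebooting, it can help to confirm that the parameter actually reached the kernel. The following is a minimal sketch, assuming a Linux system where /proc/cmdline is readable; the helper names parse_kernel_cmdline and dma_debug_disabled are hypothetical and only for illustration.

```python
def parse_kernel_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None. This simple split does not
    handle quoted values that contain spaces.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def dma_debug_disabled(cmdline):
    """Return True if the command line carries dma_debug=off."""
    return parse_kernel_cmdline(cmdline).get("dma_debug") == "off"


# Usage on a running system (assumes /proc/cmdline exists):
#     with open("/proc/cmdline") as f:
#         print(dma_debug_disabled(f.read()))
```

The same helper can be used to check for the other parameters tried in this report, such as pci=nomsi or radeon.modeset=0.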

Comment 6 Kyle McMartin 2009-08-19 05:14:12 UTC
Also, could you try the rawhide kernel scratch build here which may fix it...
http://koji.fedoraproject.org/koji/taskinfo?taskID=1613737

Comment 7 Rodd Clarkson 2009-08-19 12:08:56 UTC
Booting with dma_debug=off fixes the problem.

I'm trying to download the scratch kernel, but I'm having issues, so when I get it done I'll tell you more.

Thanks for your help on this (so far) and I guess say thanks to Dave. ;-]


R.

Comment 8 Rodd Clarkson 2009-08-19 12:20:42 UTC
Okay, got the files.  It turns out that if you have connection issues while downloading these files, Firefox has big issues restarting the download.  (I had to click Tools > Clear Recent History and then clear the recent history to get the links to download the files again.)

Regardless, I've downloaded them and installed them as follows (I've included the output because of an issue with /etc/sysconfig/keyboard during the install):

Note: this is just install stuff, I haven't rebooted yet.

$ sudo yum install kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64.rpm kernel-firmware-2.6.31-0.164.rc6.git3.bz513528.fc12.noarch.rpm --nogpg
Loaded plugins: refresh-packagekit
Setting up Install Process
Examining kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64.rpm: kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64
Marking kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64.rpm as an update to kernel-2.6.31-0.118.rc5.fc12.x86_64
Marking kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64.rpm as an update to kernel-2.6.31-0.125.4.2.rc5.git2.fc12.x86_64
Examining kernel-firmware-2.6.31-0.164.rc6.git3.bz513528.fc12.noarch.rpm: kernel-firmware-2.6.31-0.164.rc6.git3.bz513528.fc12.noarch
Marking kernel-firmware-2.6.31-0.164.rc6.git3.bz513528.fc12.noarch.rpm as an update to kernel-firmware-2.6.31-0.125.4.2.rc5.git2.fc12.noarch
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:2.6.31-0.164.rc6.git3.bz513528.fc12 set to be installed
---> Package kernel-firmware.noarch 0:2.6.31-0.164.rc6.git3.bz513528.fc12 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package
   Arch   Version
              Repository                                                   Size
================================================================================
Installing:
 kernel
   x86_64 2.6.31-0.164.rc6.git3.bz513528.fc12
              /kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64           97 M
Updating:
 kernel-firmware
   noarch 2.6.31-0.164.rc6.git3.bz513528.fc12
              /kernel-firmware-2.6.31-0.164.rc6.git3.bz513528.fc12.noarch 1.7 M

Transaction Summary
================================================================================
Install       1 Package(s)
Upgrade       1 Package(s)

Total size: 98 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Updating       : kernel-firmware-2.6.31-0.164.rc6.git3.bz513528.fc12.no   1/3 
  Installing     : kernel-2.6.31-0.164.rc6.git3.bz513528.fc12.x86_64        2/3 
/sbin/new-kernel-pkg: line 446: /etc/sysconfig/keyboard: No such file or directory
  Cleanup        : kernel-firmware-2.6.31-0.125.4.2.rc5.git2.fc12.noarch    3/3 

Installed:
  kernel.x86_64 0:2.6.31-0.164.rc6.git3.bz513528.fc12                           

Updated:
  kernel-firmware.noarch 0:2.6.31-0.164.rc6.git3.bz513528.fc12                  

Complete!

Comment 9 Rodd Clarkson 2009-08-19 12:37:54 UTC
Okay, the kernel seems to boot, but I don't get to gdm and can't do anything (no login, no VT).

Interestingly, this is the first kernel that has shown the flashy boot display on this machine.  All the others (including F11) have just shown the bar at the bottom.

Comment 10 Kyle McMartin 2009-08-19 17:46:48 UTC
Created attachment 357961 [details]
make smscore use a struct device with dma ops.

Patch attached.

Comment 11 Kyle McMartin 2009-08-19 17:48:02 UTC
Created attachment 357962 [details]
make smscore use a struct device with dma ops.

ugh. need coffee. properly attached it now.

Comment 12 Mauro Carvalho Chehab 2009-08-19 18:28:36 UTC
Rodd,

Could you please attach the output of lsusb? I'd like to know what Siano device you have on your notebook.

Comment 13 Rodd Clarkson 2009-08-19 23:17:38 UTC
[rodd@moose ~]$ lsusb
Bus 003 Device 003: ID 413c:8157 Dell Computer Corp. 
Bus 003 Device 004: ID 413c:8158 Dell Computer Corp. 
Bus 003 Device 002: ID 0a5c:4500 Broadcom Corp. 
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 006 Device 002: ID 046d:c526 Logitech, Inc. 
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 003: ID 2040:1801 Hauppauge 
Bus 001 Device 004: ID 05ca:18a1 Ricoh Co., Ltd 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub



[rodd@moose ~]$ sudo lsusb -vv -d 2040:1801
[sudo] password for rodd: 

Bus 001 Device 003: ID 2040:1801 Hauppauge 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x2040 Hauppauge
  idProduct          0x1801 
  bcdDevice            0.01
  iManufacturer           1 Hauppauge Computer Works
  iProduct                2 WinTV-NOVA
  iSerial                 3 f05eb5ec
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol       255 Vendor Specific Protocol
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)
[rodd@moose ~]$

Comment 14 Rodd Clarkson 2009-08-19 23:21:41 UTC
Yeah, I know that the word Siano doesn't appear anywhere in this.  Strange???

Also, this output is from f11.  If you want me to boot f12 and report again then ask, but f12 is a little too buggy for day to day use at the moment ;-]

Comment 15 Mauro Carvalho Chehab 2009-08-19 23:46:46 UTC
> Yeah, I know that the word Siano doesn't appear anywhere in this.  Strange???

Not really. There are some Hauppauge devices based on this chipset.
From smsusb.c:

        { USB_DEVICE(0x2040, 0x1801),
                .driver_info = SMS1XXX_BOARD_HAUPPAUGE_OKEMO_B },

From sms-cards.c:
        [SMS1XXX_BOARD_HAUPPAUGE_OKEMO_B] = {
                .name   = "Hauppauge Okemo-B",
                .type   = SMS_NOVA_B0,
                .fw[DEVICE_MODE_DVBT_BDA] = "sms1xxx-nova-b-dvbt-01.fw",
        },

I'll ask some friends at Hauppauge to check that the proposed patch doesn't break the device.
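
The match between the lsusb output and the driver tables can be sketched as a simple lookup. This is only an illustration, not the driver's own data structure: the dict layout and the function name find_known_boards are assumptions, and the single entry is copied from the smsusb.c and sms-cards.c excerpts quoted in this comment.

```python
import re

# Entry copied from the smsusb.c and sms-cards.c excerpts quoted above.
# The dict layout is illustrative only and is not how the driver stores it.
USB_ID_TO_BOARD = {
    (0x2040, 0x1801): {
        "board": "SMS1XXX_BOARD_HAUPPAUGE_OKEMO_B",
        "name": "Hauppauge Okemo-B",
        "firmware": "sms1xxx-nova-b-dvbt-01.fw",
    },
}

# Matches the start of a plain lsusb line: "Bus 001 Device 003: ID 2040:1801 ..."
LSUSB_LINE = re.compile(
    r"^Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})"
)


def find_known_boards(lsusb_output):
    """Return the table entries for every lsusb line with a known USB ID."""
    matches = []
    for line in lsusb_output.splitlines():
        m = LSUSB_LINE.match(line.strip())
        if not m:
            continue
        usb_id = (int(m.group(3), 16), int(m.group(4), 16))
        if usb_id in USB_ID_TO_BOARD:
            matches.append(USB_ID_TO_BOARD[usb_id])
    return matches
```

Fed the lsusb output from comment 13, this matches only the 2040:1801 line, which is why the device is handled by the Siano driver even though the word Siano never appears in the descriptors.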

Comment 16 Rodd Clarkson 2009-08-20 00:02:28 UTC
Would the same friends like to help me get it working?

I've tried a number of firmware files, but I can't seem to get this card to work with F11 and MythTV.  I can get it to find channels, but it refuses to show live TV.  As a result I removed the firmware, because it wasn't working and it stopped suspend/resume from working.

see: https://bugzilla.redhat.com/show_bug.cgi?id=513095

Comment 17 Kyle McMartin 2009-08-20 01:15:44 UTC
Created attachment 358015 [details]
fix dma-debug functionality for null ptr deref in unmap path

I'll do a scratch build with this patch included tonight and follow up with the koji url.

thanks. Kyle.

Comment 18 Rodd Clarkson 2009-08-20 03:45:10 UTC
I checked lsusb -vv and it's the same on f12.

Also, I dare you to try and find that firmware on the web.  I found one place, and I'm not sure that it's the right firmware (as you have to rename it).

I have found dvb_nova_12mhz_b0.inp (on the Siano ftp server), which is the file requested after the driver fails to find the firmware above, but again I'm not 100% sure it's right (since I couldn't get it to show TV).

If your friends at Hauppauge have a firmware for this device that they could supply, I'll happily try testing it again.

Comment 19 Rodd Clarkson 2009-08-31 11:29:03 UTC
Not sure if this is this bug, or needs to be filed as a new bug, but I haven't had any luck booting rawhide with recent kernels.

The boot process seems to go fine and then it moves to starting X, but seems to lock up.

I can boot normally with kernel-2.6.31-0.125.4.2.rc5.git2.fc12.x86_64, as long as I add the dma_debug=off.

However, newer kernels (the bz kernel above, the current kernel and another about a week ago) all fail, regardless of whether or not I use dma_debug=off.

Should I file a new bug for this, or is it related?

Comment 20 Rodd Clarkson 2009-09-02 00:15:32 UTC
Okay, more useful information.

I can boot these problematic kernels with a 3 on the boot line and get a working console.  However, starting X (startx) still fails.  Also, I've discovered that I can press the power key and the system will shut down properly.

I'll attach the dmesg output generated from starting at runlevel 3, and the Xorg.0.log output from starting X from the command line (as a non-root user).  Neither of these files is generated when you try to boot to runlevel 5.

Comment 21 Rodd Clarkson 2009-09-02 00:17:56 UTC
Created attachment 359457 [details]
dmesg output starting system at runlevel 3

There is a trace in the dmesg which I've cut out here:

------------[ cut here ]------------
WARNING: at lib/dma-debug.c:798 check_unmap+0x163/0x5d6() (Not tainted)
Hardware name: Studio XPS 1640
NULL NULL: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000d00000] [size=819200 bytes]
Modules linked in: iwlcore uvcvideo snd_hwdep smsusb(+) snd_pcm mac80211 sdhci_pci videodev sdhci snd_timer ricoh_mmc mmc_core firewire_ohci snd wmi v4l1_compat sms1xxx v4l2_compat_ioctl32 firewire_core cfg80211 soundcore crc_itu_t dell_laptop iTCO_wdt rfkill i2c_i801 snd_page_alloc iTCO_vendor_support dcdbas joydev tg3 video output radeon ttm drm i2c_algo_bit i2c_core
Pid: 480, comm: modprobe Not tainted 2.6.31-0.190.rc8.fc12.x86_64 #1
Call Trace:
 [<ffffffff810653c4>] warn_slowpath_common+0x95/0xc3
 [<ffffffff8106547f>] warn_slowpath_fmt+0x50/0x66
 [<ffffffff8128eae2>] check_unmap+0x163/0x5d6
 [<ffffffff81096d1b>] ? mark_lock+0x3c/0x253
 [<ffffffff8128efd0>] debug_dma_free_coherent+0x7b/0x9d
 [<ffffffffa01c56f8>] smscore_unregister_device+0x170/0x1f7 [sms1xxx]
 [<ffffffffa027a590>] smsusb_term_device+0x45/0x96 [smsusb]
 [<ffffffffa027ad4a>] smsusb_probe+0x5b4/0x656 [smsusb]
 [<ffffffffa027a618>] ? smsusb_sendrequest+0x0/0x75 [smsusb]
 [<ffffffff813b8c87>] ? usb_autopm_do_device+0xd3/0xf3
 [<ffffffff813b9968>] usb_probe_interface+0x158/0x21f
 [<ffffffff81352dbb>] driver_probe_device+0xed/0x22a
 [<ffffffff81352f64>] __driver_attach+0x6c/0xa6
 [<ffffffff81352ef8>] ? __driver_attach+0x0/0xa6
 [<ffffffff81351f57>] bus_for_each_dev+0x68/0xb3
 [<ffffffff81352aba>] driver_attach+0x31/0x47
 [<ffffffff8135268b>] bus_add_driver+0x109/0x286
 [<ffffffff81353382>] driver_register+0xac/0x134
 [<ffffffff813b9674>] usb_register_driver+0xca/0x149
 [<ffffffffa027aef2>] ? smsusb_module_init+0x0/0x96 [smsusb]
 [<ffffffffa027af29>] smsusb_module_init+0x37/0x96 [smsusb]
 [<ffffffffa027aef2>] ? smsusb_module_init+0x0/0x96 [smsusb]
 [<ffffffff8100a0b3>] do_one_initcall+0x81/0x1b9
 [<ffffffff810a55ae>] sys_init_module+0xe7/0x239
 [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
---[ end trace 50b7e0422ccbad7c ]---
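
Reading the trace from the bottom up, the warning is reached through smsusb_probe, smsusb_term_device, smscore_unregister_device, debug_dma_free_coherent and finally check_unmap. A minimal sketch for pulling that chain out of a saved dmesg, assuming the frame format shown above; the regular expression and the name reliable_frames are illustrative, not part of any kernel tool.

```python
import re

# Matches frame lines such as:
#  [<ffffffff8128eae2>] check_unmap+0x163/0x5d6
#  [<ffffffffa027a618>] ? smsusb_sendrequest+0x0/0x75 [smsusb]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)


def reliable_frames(trace_text):
    """Return (symbol, module) pairs, innermost first, skipping '?' frames.

    Frames prefixed with '?' are addresses found on the stack that the
    unwinder could not confirm, so they are left out here. The module is
    None for symbols built into the kernel image.
    """
    frames = []
    for line in trace_text.splitlines():
        m = FRAME.search(line)
        if m and not m.group(1):
            frames.append((m.group(2), m.group(3)))
    return frames
```

Filtering the result for frames whose module is not None shows at a glance which loadable modules (here sms1xxx and smsusb) are involved.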

Comment 22 Rodd Clarkson 2009-09-02 00:19:36 UTC
Created attachment 359458 [details]
Xorg.0.log output after running startx from runlevel 3

X fails, but here's the output before it fails.  Note that the file ends with some output about:

drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 11, (OK)

Comment 23 Rodd Clarkson 2009-09-10 10:54:35 UTC
Okay, the problem I have now seems to be related to bug 521322.

*** This bug has been marked as a duplicate of bug 521322 ***

