Bug 587773 - [abrt] crash in kernel: WARNING: at drivers/pci/msi.c:658 pci_enable_msi_block+0x220/0x260() (Tainted: P)
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: i686 Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
Whiteboard: abrt_hash:1223520226
Duplicates: 588333
Reported: 2010-04-30 16:16 EDT by Arthur Enright
Modified: 2010-05-17 10:08 EDT
Doc Type: Bug Fix
Last Closed: 2010-05-17 10:08:42 EDT


Attachments
File: kerneloops (2.31 KB, text/plain)
2010-04-30 16:16 EDT, Arthur Enright

Description Arthur Enright 2010-04-30 16:16:10 EDT
abrt 1.0.7 detected a crash.

architecture: i686
cmdline: not_applicable
comment: I logged out, logged back in using username/passwd while my CAC/Smart Card was inserted.  System reported a problem with escd stating that it was already running but the 'Smart Card Manager' applet never initialized/displayed in the gnome toolbar.
component: kernel
executable: kernel
kernel: 2.6.32-19.el6.i686
Attached file: kerneloops
package: kernel
reason: ------------[ cut here ]------------
release: Red Hat Enterprise Linux release 6.0 Beta (Santiago)

How to reproduce
-----
1. I think this had something to do with the escd smart card daemon.
Comment 1 Arthur Enright 2010-04-30 16:16:12 EDT
Created attachment 410590
File: kerneloops
Comment 3 RHEL Product and Program Management 2010-04-30 18:00:50 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 4 Eric Sandeen 2010-05-03 11:49:27 EDT
------------[ cut here ]------------
WARNING: at drivers/pci/msi.c:658 pci_enable_msi_block+0x220/0x260() (Tainted: P          )
Hardware name: OptiPlex 760                 
Modules linked in: fglrx(P)(U) fuse(U) ipt_MASQUERADE(U) iptable_nat(U) nf_nat(U) bridge(U) stp(U) llc(U) deflate(U) zlib_deflate(U) ctr(U) camellia(U) cast5(U) rmd160(U) crypto_null(U) ccm(U) serpent(U) blowfish(U) twofish(U) twofish_common(U) ecb(U) xcbc(U) cbc(U) sha256_generic(U) sha512_generic(U) des_generic(U) ah6(U) ah4(U) esp6(U) esp4(U) xfrm4_mode_beet(U) xfrm4_tunnel(U) tunnel4(U) xfrm4_mode_tunnel(U) xfrm4_mode_transport(U) xfrm6_mode_transport(U) xfrm6_mode_ro(U) xfrm6_mode_beet(U) xfrm6_mode_tunnel(U) ipcomp(U) ipcomp6(U) xfrm_ipcomp(U) xfrm6_tunnel(U) tunnel6(U) af_key(U) autofs4(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) ip6t_REJECT(U) nf_conntrack_ipv6(U) ip6table_filter(U) ip6_tables(U) ipv6(U) ext3(U) jbd(U) aes_i586(U) aes_generic(U) xts(U) gf128mul(U) dm_crypt(U) dm_mirror(U) dm_region_hash(U) dm_log(U) uinput(U) snd_hda_codec_analog(U) snd_hda_intel(U) snd_hda_codec(U) snd_hwdep(U) snd_seq(U) snd_seq_device(U) sg(U) sr_mod(U) cdrom(U) snd_pcm(U) snd_timer(U) ppdev(U) parport_pc(U) snd(U) parport(U) iTCO_wdt(U) dcdbas(U) iTCO_vendor_support(U) serio_raw(U) e1000e(U) soundcore(U) i2c_i801(U) snd_page_alloc(U) ext4(U) mbcache(U) jbd2(U) dm_multipath(U) sd_mod(U) crc_t10dif(U) ahci(U) ata_generic(U) pata_acpi(U) radeon(U) ttm(U) drm_kms_helper(U) drm(U) i2c_algo_bit(U) i2c_core(U) dm_mod(U) [last unloaded: scsi_wait_scan]
Pid: 12114, comm: Xorg Tainted: P           2.6.32-19.el6.i686 #1
Call Trace:
[<c044e197>] ? warn_slowpath_common+0x77/0xb0
[<c05fc750>] ? pci_enable_msi_block+0x220/0x260
[<c044e1e3>] ? warn_slowpath_null+0x13/0x20
[<c05fc750>] ? pci_enable_msi_block+0x220/0x260
[<f923a067>] ? IRQMGR_initialize+0x297/0x430 [fglrx]
[<f924d7e8>] ? firegl_trace+0x28/0x190 [fglrx]
[<f92391d3>] ? irqmgr_wrap_initialize+0x23/0xd0 [fglrx]
[<f9238a98>] ? firegl_interrupt_control+0x1b8/0x1f0 [fglrx]
[<c0583d62>] ? security_capable+0x22/0x30
[<f92388e0>] ? firegl_interrupt_control+0x0/0x1f0 [fglrx]
[<f922868d>] ? firegl_ioctl+0x22d/0x2b0 [fglrx]
[<f921dcf7>] ? ip_firegl_ioctl+0x17/0x20 [fglrx]
[<c05220a9>] ? vfs_ioctl+0x89/0xa0
[<c052229c>] ? do_vfs_ioctl+0x6c/0x5d0
[<c0522876>] ? sys_ioctl+0x76/0x90
[<c040a3fb>] ? sysenter_do_call+0x12/0x28

Given that the call trace passes through a proprietary kernel module (fglrx) just before hitting the kernel warning, this is likely something you will need to take up with the driver vendor.
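
The triage step above (spotting which module the trace passes through) can be sketched in a few lines of Python. This is only an illustration, not a tool used in this bug: the function name modules_in_trace, the "vmlinux" label for core-kernel frames, and the regular expression are assumptions chosen for the example, and the sample frames are copied from the call trace in this comment.

```python
import re
from collections import Counter

# Matches kernel call-trace frames such as:
#   [<f923a067>] ? IRQMGR_initialize+0x297/0x430 [fglrx]
# The trailing "[module]" part is absent for frames in the core kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def modules_in_trace(trace_text):
    """Count call-trace frames per module.

    Frames without a module suffix are counted under the (hypothetical)
    label 'vmlinux', meaning the core kernel image.
    """
    counts = Counter()
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            counts[match.group("module") or "vmlinux"] += 1
    return counts


# A few frames copied from the trace in this comment.
SAMPLE = """\
[<c05fc750>] ? pci_enable_msi_block+0x220/0x260
[<f923a067>] ? IRQMGR_initialize+0x297/0x430 [fglrx]
[<f924d7e8>] ? firegl_trace+0x28/0x190 [fglrx]
[<c05220a9>] ? vfs_ioctl+0x89/0xa0
"""

print(modules_in_trace(SAMPLE))
```

Any module that shows up here and also carries the (P) flag in the "Modules linked in" line is a reasonable first suspect, which is the reasoning applied to fglrx above.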
Comment 5 Adam Jackson 2010-05-03 12:01:24 EDT
Why you're using fglrx at all is a mystery.  Tsk tsk.

fglrx is trying to turn on message-signalled interrupts for the video device, but it's failing for some reason.  We may have disabled MSI for your machine; dmesg would tell you, and if so, it's probably our bug.  Or, we may not have turned MSI on for some bridge leading up to the video device; lspci -t and /sys/bus/pci/devices/*/msi_bus would tell you.

If MSI is enabled in the kernel and in the bridges leading up to the device, then the device is refusing to accept MSI setup, which means it's either an fglrx device setup bug, or an ATI hardware bug.
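
A rough way to collect the msi_bus values mentioned above is sketched below in Python. It assumes the usual sysfs layout (/sys/bus/pci/devices/<address>/msi_bus); the function names read_msi_bus and bridges_with_msi_disabled are made up for this example. An empty value is treated as "nothing reported" rather than "disabled", since the attribute is only meaningful for bridges. The demo runs against a throwaway directory so it does not depend on the local hardware.

```python
import os
import tempfile


def read_msi_bus(devices_dir="/sys/bus/pci/devices"):
    """Map each PCI address to the stripped contents of its msi_bus file.

    Devices without a readable msi_bus file are skipped. An empty string
    means the kernel printed nothing for that device.
    """
    result = {}
    if not os.path.isdir(devices_dir):
        return result
    for address in sorted(os.listdir(devices_dir)):
        path = os.path.join(devices_dir, address, "msi_bus")
        try:
            with open(path) as handle:
                result[address] = handle.read().strip()
        except OSError:
            continue
    return result


def bridges_with_msi_disabled(msi_bus_values):
    """Return the addresses whose msi_bus value is explicitly '0'."""
    return sorted(addr for addr, value in msi_bus_values.items() if value == "0")


# Demo against a fake tree (addresses are illustrative, not read from a machine).
with tempfile.TemporaryDirectory() as fake_sysfs:
    for address, value in (("0000:00:01.0", "1\n"), ("0000:01:00.0", "")):
        os.makedirs(os.path.join(fake_sysfs, address))
        with open(os.path.join(fake_sysfs, address, "msi_bus"), "w") as handle:
            handle.write(value)
    DEMO = read_msi_bus(fake_sysfs)

print(DEMO)
print(bridges_with_msi_disabled(DEMO))
```

On a real system, calling read_msi_bus() with no argument reads the live sysfs tree; any bridge on the path to the video device that reports 0 would be the one to look at.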
Comment 6 Eric Sandeen 2010-05-03 12:17:38 EDT
*** Bug 588333 has been marked as a duplicate of this bug. ***
Comment 7 Arthur Enright 2010-05-03 12:27:32 EDT
On site with a customer using AMD/ATI HW and they need the HW acceleration for medical imaging :-)

Here is the dmesg output grepped for MSI; I can provide the full dmesg if it would be helpful:

hpet: hpet2 irq 24 for MSI
hpet: hpet3 irq 25 for MSI
pcieport 0000:00:01.0: irq 29 for MSI/MSI-X
pcieport 0000:00:1c.0: irq 30 for MSI/MSI-X
pcieport 0000:00:1c.1: irq 31 for MSI/MSI-X
radeon 0000:01:00.0: irq 32 for MSI/MSI-X
[drm] radeon: using MSI.
ahci 0000:00:1f.2: irq 33 for MSI/MSI-X
e1000e 0000:00:19.0: irq 34 for MSI/MSI-X
HDA Intel 0000:00:1b.0: irq 35 for MSI/MSI-X
e1000e 0000:00:19.0: irq 34 for MSI/MSI-X
e1000e 0000:00:19.0: irq 34 for MSI/MSI-X
WARNING: at drivers/pci/msi.c:658 pci_enable_msi_block+0x220/0x260() (Tainted: P          )
 [<c05fc750>] ? pci_enable_msi_block+0x220/0x260
 [<c05fc750>] ? pci_enable_msi_block+0x220/0x260
radeon 0000:01:00.0: irq 36 for MSI/MSI-X
radeon 0000:01:00.0: irq 37 for MSI/MSI-X


lspci -t:

-[0000:00]-+-00.0
           +-01.0-[01]----00.0
           +-03.0
           +-03.2
           +-03.3
           +-19.0
           +-1a.0
           +-1a.1
           +-1a.2
           +-1a.7
           +-1b.0
           +-1c.0-[02]--
           +-1c.1-[03]--
           +-1d.0
           +-1d.1
           +-1d.2
           +-1d.7
           +-1e.0-[04]--
           +-1f.0
           +-1f.2
           \-1f.3

# cat /sys/bus/pci/devices/0000\:01\:00.0/msi_bus 

# grep 1 /sys/bus/pci/devices/*/msi_bus
/sys/bus/pci/devices/0000:00:01.0/msi_bus:1
/sys/bus/pci/devices/0000:00:1c.0/msi_bus:1
/sys/bus/pci/devices/0000:00:1c.1/msi_bus:1
/sys/bus/pci/devices/0000:00:1e.0/msi_bus:1

Looks like MSI isn't on for the device (unless I misread the device ID); however, it is enabled for some of the other devices.

What do you think?
Comment 8 Adam Jackson 2010-05-03 13:20:40 EDT
You seem to have both radeon and fglrx loaded.  That's not likely to work well; honestly I'm shocked it works at all.  Blacklist it in /etc/modprobe.d, run /usr/libexec/plymouth/plymouth-update-initrd, and reboot.

(Actually, I think I misread this the first time around.  The warning code is:

        WARN_ON(!!dev->msi_enabled);

which means "warn if it's _already_ enabled".  Which it probably would be, since radeon would have already loaded and set it up before fglrx came along.)
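
Since the warning fires when MSI is already enabled on the device, a quick check is whether two drivers for the same hardware are loaded at once. The Python sketch below parses /proc/modules-style text; the function names and the default candidate pair are assumptions for this example, and the sizes and addresses in the sample are invented placeholders, not values from the affected machine.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.

    The module name is the first whitespace-separated field on each line.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def conflicting_drivers(proc_modules_text, candidates=("radeon", "fglrx")):
    """Return the candidate drivers that are loaded together.

    The result is empty when fewer than two of the candidates are present.
    """
    present = sorted(loaded_modules(proc_modules_text) & set(candidates))
    return present if len(present) > 1 else []


# Illustrative sample only: sizes, reference counts and addresses are made up.
SAMPLE = """\
fglrx 2000000 30 - Live 0xf9200000 (P)
radeon 600000 1 - Live 0xf8c00000
ttm 50000 1 radeon, Live 0xf8b00000
"""

print(conflicting_drivers(SAMPLE))
```

On a live system the input would be the contents of /proc/modules, for example conflicting_drivers(open("/proc/modules").read()).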
Comment 9 Arthur Enright 2010-05-03 13:26:17 EDT
[Verbatim resubmission of comment 7: same dmesg excerpt, lspci -t tree, and msi_bus output.]
Comment 10 Arthur Enright 2010-05-03 13:27:07 EDT
Wow, not sure how I re-submitted that.  You read my mind on the blacklist.  I'm blacklisting the radeon driver and rebooting now.

Thanks!
Comment 11 Arthur Enright 2010-05-03 13:49:46 EDT
The AMD/ATI installer did create an /etc/modprobe.d/blacklist-fglrx.conf file:

# Advanced Micro Devices, Inc.
# radeon conflicts with AMD Linux Graphics Driver
blacklist radeon

but it seems like the missing step was rebuilding initrd.  No more kernel crash with the rebuilt initrd.  I will submit something to AMD requesting that the rebuild be part of the installer.

Ironically, when both the radeon and fglrx drivers were loading, there was the benefit of the pretty boot splash, and X was able to use the HW acceleration features of the fglrx driver.  It's a pity they don't play nice.

Thanks for all your help!
-Art
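
For anyone checking a similar setup, the blacklist entry quoted above can be verified with a small Python sketch like the one below. It only inspects the text of a modprobe.d file; it says nothing about whether the initrd was rebuilt, which was the missing step here. The function name blacklisted_modules is an assumption for this example, and treating everything after a '#' as a comment is a simplification.

```python
def blacklisted_modules(conf_text):
    """Return module names named by 'blacklist <module>' lines.

    Anything after a '#' is dropped as a comment (a simplification), and
    only lines of exactly the form 'blacklist NAME' are recognised.
    """
    names = set()
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            names.add(parts[1])
    return names


# Contents of /etc/modprobe.d/blacklist-fglrx.conf as quoted in this comment.
SAMPLE_CONF = """\
# Advanced Micro Devices, Inc.
# radeon conflicts with AMD Linux Graphics Driver
blacklist radeon
"""

print(blacklisted_modules(SAMPLE_CONF))
```

If radeon is listed here but still appears in /proc/modules after a reboot, the initrd is a likely place where it is being loaded early, as happened in this case.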