Bug 490477 - [Intel IOMMU] Using isochronous DMAR unit on ICH10 board causes _other_ DMAR unit to stop working.
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: David Woodhouse
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Duplicates: 499614
Depends On:
Blocks: F11VirtTarget 499352
Reported: 2009-03-16 12:25 EDT by drago01
Modified: 2009-11-07 18:59 EST (History)
CC List: 13 users

Doc Type: Bug Fix
Last Closed: 2009-11-07 18:59:37 EST


Attachments
Image showing the error messages on boot (2.42 MB, image/png)
2009-03-16 12:25 EDT, drago01
dmidecode output (22.97 KB, text/plain)
2009-03-16 12:26 EDT, drago01
smaller version of error message image (305.42 KB, image/jpeg)
2009-03-26 00:30 EDT, Chuck Ebbert
potential workaround for broken bios (1.50 KB, patch)
2009-05-07 09:19 EDT, David Woodhouse
Logfile from the console (49.93 KB, text/plain)
2009-05-14 04:23 EDT, Rainer Koenig
Quirk to use passthrough mode for sound (1.75 KB, patch)
2009-08-04 12:10 EDT, David Woodhouse

Description drago01 2009-03-16 12:25:54 EDT
Created attachment 335366 [details]
Image showing the error messages on boot

Description of problem:

When I enable VT-d in the BIOS and boot with intel_iommu=on, the system freezes with a kernel panic (screenshot attached).

Version-Release number of selected component (if applicable):

2.6.29-0.53.rc7.fc10.x86_64

How reproducible:

Always

Steps to Reproduce:
1. Enable VT-d
2. boot with intel_iommu=on
  
Actual results:

Kernel panic

Expected results:

No kernel panic

Additional info:

00:00.0 Host bridge [0600]: Intel Corporation X58 I/O Hub to ESI Port [8086:3405] (rev 12)
00:01.0 PCI bridge [0604]: Intel Corporation X58 I/O Hub PCI Express Root Port 1 [8086:3408] (rev 12)
00:03.0 PCI bridge [0604]: Intel Corporation X58 I/O Hub PCI Express Root Port 3 [8086:340a] (rev 12)
00:07.0 PCI bridge [0604]: Intel Corporation X58 I/O Hub PCI Express Root Port 7 [8086:340e] (rev 12)
00:10.0 PIC [0800]: Intel Corporation X58 Physical and Link Layer Registers Port 0 [8086:3425] (rev 12)
00:10.1 PIC [0800]: Intel Corporation X58 Routing and Protocol Layer Registers Port 0 [8086:3426] (rev 12)
00:13.0 PIC [0800]: Intel Corporation X58 I/O Hub I/OxAPIC Interrupt Controller [8086:342d] (rev 12)
00:14.0 PIC [0800]: Intel Corporation X58 I/O Hub System Management Registers [8086:342e] (rev 12)
00:14.1 PIC [0800]: Intel Corporation X58 I/O Hub GPIO and Scratch Pad Registers [8086:3422] (rev 12)
00:14.2 PIC [0800]: Intel Corporation X58 I/O Hub Control Status and RAS Registers [8086:3423] (rev 12)
00:14.3 PIC [0800]: Intel Corporation X58 I/O Hub Throttle Registers [8086:3438] (rev 12)
00:1a.0 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4 [8086:3a37]
00:1a.1 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5 [8086:3a38]
00:1a.2 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6 [8086:3a39]
00:1a.7 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2 [8086:3a3c]
00:1b.0 Audio device [0403]: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller [8086:3a3e]
00:1c.0 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 1 [8086:3a40]
00:1c.2 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 3 [8086:3a44]
00:1c.5 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 6 [8086:3a4a]
00:1d.0 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1 [8086:3a34]
00:1d.1 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2 [8086:3a35]
00:1d.2 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3 [8086:3a36]
00:1d.7 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1 [8086:3a3a]
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev 90)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller [8086:3a16]
00:1f.2 SATA controller [0106]: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller [8086:3a22]
00:1f.3 SMBus [0c05]: Intel Corporation 82801JI (ICH10 Family) SMBus Controller [8086:3a30]
02:00.0 VGA compatible controller [0300]: nVidia Corporation Device [10de:05e3] (rev a1)
04:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller [11ab:4364] (rev 12)
05:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller [11ab:4364] (rev 12)
07:01.0 Network controller [0280]: Atheros Communications Inc. AR5008 Wireless Network Adapter [168c:0023] (rev 01)
07:02.0 FireWire (IEEE 1394) [0c00]: VIA Technologies, Inc. VT6306 Fire II IEEE 1394 OHCI Link Layer Controller [1106:3044] (rev c0)
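
The device list above is in the format printed by `lspci -nn`. As a minimal sketch (the name `parse_lspci_line` and the choice of sample line are illustrative, not part of the report), the slot, class code, and vendor:device ID can be extracted from each line like this:

```python
import re

# Matches lines such as:
#   00:1b.0 Audio device [0403]: Intel Corporation ... [8086:3a3e]
#   00:00.0 Host bridge [0600]: Intel Corporation ... [8086:3405] (rev 12)
LSPCI_RE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"(?P<class_name>.+?) \[(?P<class_code>[0-9a-f]{4})\]: "
    r"(?P<name>.+) \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
    r"(?: \(rev (?P<rev>[0-9a-f]+)\))?$"
)


def parse_lspci_line(line):
    """Return a dict of fields for one `lspci -nn` line, or None if it does not match."""
    match = LSPCI_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "00:1b.0 Audio device [0403]: Intel Corporation 82801JI (ICH10 Family) "
    "HD Audio Controller [8086:3a3e]"
)
info = parse_lspci_line(sample)
print(info["slot"], info["class_code"], info["vendor"], info["device"])
# prints: 00:1b.0 0403 8086 3a3e
```

The `rev` field is `None` for lines without a revision suffix.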
Comment 1 drago01 2009-03-16 12:26:33 EDT
Created attachment 335367 [details]
dmidecode output
Comment 2 David Woodhouse 2009-03-16 12:29:15 EDT
Does it work if you disable legacy USB keyboard/mouse emulation in the BIOS?
Comment 3 drago01 2009-03-16 12:49:52 EDT
(In reply to comment #2)
> Does it work if you disable legacy USB keyboard/mouse emulation in the BIOS?  

Yes, the kernel boots this way, but then the system hangs at "starting udev".
Comment 4 David Woodhouse 2009-03-19 09:00:09 EDT
A BIOS bug then. I suspect the hang at starting udev is going to be something different?
Comment 5 Chuck Ebbert 2009-03-26 00:30:34 EDT
Created attachment 336751 [details]
smaller version of error message image
Comment 6 Chuck Ebbert 2009-04-13 14:34:25 EDT
Fixes that went into Fedora 11 are now in 2.6.29.1-19 and later kernels.
Comment 7 drago01 2009-04-13 15:37:36 EDT
(In reply to comment #4)
> A BIOS bug then. I suspect the hang at starting udev is going to be something
> different?  

Updated to BIOS 0302, but it did not fix it (well, the changelog says "new CPU support").

(In reply to comment #6)
> Fixes that went into Fedora 11 are now in 2.6.29.1-19 and later kernels. 

Will test when the builds show up in koji ;)
Comment 8 drago01 2009-04-14 14:02:19 EDT
(In reply to comment #7)
> (In reply to comment #4)
> > A BIOS bug then. I suspect the hang at starting udev is going to be something
> > different?  
> 
> Updated to BIOS 0302 but it did not fix it (well changelog says "new cpu
> support")
> 
> (In reply to comment #6)
> > Fixes that went into Fedora 11 are now in 2.6.29.1-19 and later kernels. 
> 
> Will test when the build show up in koji ;)  

Still same/similar results.
Sometimes it hangs when trying to set up the HPET. (Booting with nohpet results in it always hanging at the same place; see the screenshot attached earlier.)
Comment 9 David Woodhouse 2009-05-07 06:54:25 EDT
The hang at boot is going to be a BIOS bug. The BIOS is using DMA for legacy keyboard/mouse emulation and doesn't tell us about it. So its DMA gets stopped (those are the faults you see in the photo). And then it hangs up in SMM mode eventually, because BIOSes are written by monkeys.

I'm more interested in the 'hang when starting udev' that you reported, after you turned off the legacy keyboard/mouse emulation. Did you ever get any more information about that?
Comment 10 Mark McLoughlin 2009-05-07 06:57:38 EDT
How about we close this as NOTABUG/WONTFIX then and file the udev bug separately?
Comment 11 David Woodhouse 2009-05-07 07:09:17 EDT
That makes sense.
Comment 12 David Woodhouse 2009-05-07 09:17:10 EDT
*** Bug 499614 has been marked as a duplicate of this bug. ***
Comment 13 David Woodhouse 2009-05-07 09:19:43 EDT
Created attachment 342840 [details]
potential workaround for broken bios

Hm, perhaps we could work around this with something like the attached patch
Comment 14 Rainer Koenig 2009-05-14 04:21:41 EDT
I tried to fix this problem with the potential fix attached to this bug, but it didn't work. The system still hangs at "starting udev" for a while, then produces a lot of errors, and finally gets into a loop of endlessly reporting

"/etc/rc.d/rc.sysinit: line 824: /bin/usleep: Input/output error"

I'm going to attach my captured log to this bug.
Comment 15 Rainer Koenig 2009-05-14 04:23:08 EDT
Created attachment 343924 [details]
Logfile from the console

Crash produced with a patched kernel.
Comment 16 David Woodhouse 2009-05-14 06:13:10 EDT
Rainer, the patch would do nothing for you -- it just disables the IOMMU for Need Real Name's machine based on the DMI identification.

If you have turned off the legacy keyboard/mouse emulation, you've worked around the BIOS bug which was preventing the machine from booting. Now you're left with an entirely more interesting problem, which may well be a Linux bug.

Please can you boot with 'mem=2G' and see if you can still reproduce the problem when you don't have any memory above physical address 0x100000000 ?
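
To spell out the arithmetic behind that request: `mem=2G` limits the RAM the kernel uses to 2 GiB, which ends at 0x80000000, well below the 4 GiB boundary at physical address 0x100000000. A quick check in plain Python (the variable names are illustrative):

```python
GIB = 1 << 30

mem_limit = 2 * GIB               # what mem=2G asks for
four_gib_boundary = 0x100000000   # the address named in the comment

print(hex(mem_limit))                  # prints: 0x80000000
print(four_gib_boundary == 4 * GIB)    # prints: True
print(mem_limit < four_gib_boundary)   # prints: True
```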
Comment 17 Rainer Koenig 2009-05-14 07:13:50 EDT
Hi David,

Yes, you're right. I just applied the patch without looking at it. :-) But if it just disables the IOMMU then it's not helping, since the goal is to get the IOMMU working. :-)

I tried with mem=2G and had the same problem on that machine. But I still believe that it's somewhat BIOS related. Maybe we should reopen

https://bugzilla.redhat.com/show_bug.cgi?id=499614

since it looks like this bug is no longer considered to be a duplicate of that bug. :-)

Regards
Rainer
Comment 18 David Woodhouse 2009-05-14 08:48:53 EDT
I believe the two systems are behaving similarly. First there's a BIOS bug which makes it crash if you enable legacy keyboard/mouse emulation. That's the bug which was originally reported, and it's unfixable except by promoting an attitude of violence towards BIOS engineers. So strictly speaking the bugs should be closed.

We're actually now discussing a _separate_ bug, which is the 'long hang in udev' which both of you seem to be seeing, although one of you (Rainer) has now reported that if you wait long enough (how long?), it eventually finishes and you see other interesting stuff.

Need-Real, can you reproduce that observation? How long did you leave it at 'starting udev' before you gave up and reset it?
Comment 19 drago01 2009-05-14 09:01:58 EDT
(In reply to comment #18)
> I believe the two systems are behaving similarly. First there's a BIOS bug
> which makes it crash if you enable legacy keyboard/mouse emulation. That's the
> bug which was originally reported, and it's unfixable except by promoting an
> attitude of violence towards BIOS engineers. So strictly speaking the bugs
> should be closed.
> 
> We're actually now discussing a _separate_ bug, which is the 'long hang in
> udev' which both of you seem to be seeing, although one of you (Rainer) has now
> reported that if you wait long enough (how long?), it eventually finishes and
> you see other interesting stuff.
> 
> Need-Real, can you reproduce that observation? How long did you leave it at
> 'starting udev' before you gave up and reset it?  

So I tested this again.
I waited ~5 min and nothing happened (it seems to hang forever). While udev was trying to start, the HDD LED was constantly on and the (USB) keyboard was dead (I could not even toggle Num Lock).
Comment 21 drago01 2009-05-14 10:46:18 EDT
(In reply to comment #20)
> http://www.scan.co.uk/Products/Asus-P6T-Deluxe-V2-Intel-X58-S1366-PCI-E-20-%28x16%29-Triple-DDR3-2000%28OC%29-1866%28OC%29-1600%28OC%29-ATX
> 
> Need-Real, you have one of these?  

Yes that's the board that I am using and have issues with.
Comment 22 Don Dutile 2009-05-14 10:57:39 EDT
Try blacklisting the wireless devices.
I've seen problems with my wireless device (that wired NICs don't have);
applying various upstream wireless driver patches stopped the kernel
crashes I saw on my Dell laptop, but it would still hang
when the wireless went to do DHCP (although the phase that connects
to the wireless router worked).
Comment 23 Rainer Koenig 2009-05-15 03:07:38 EDT
(In reply to comment #18)
> I believe the two systems are behaving similarly. First there's a BIOS bug
> which makes it crash if you enable legacy keyboard/mouse emulation. That's the
> bug which was originally reported, and it's unfixable except by promoting an
> attitude of violence towards BIOS engineers. So strictly speaking the bugs
> should be closed.

I doubt that you can solve a technical problem by adding a social layer to it. :-)

Since the kernel reports 

DMAR:Unknown DMAR structure type

in my case, I'm going to discuss this with BIOS development. There must be a technical reason *why* this message is there, and digging into this should help somehow. A colleague also offered to run a VT-d test for Windows on that machine so that we can verify the integrity of the data provided by the BIOS.

BIOS development probably also has some data verification tools from Intel. Unfortunately this bug is not the one with the highest priority, but I'll keep trying to find a root cause for the misbehaviour.

Regards
Rainer
Comment 24 David Woodhouse 2009-05-15 05:13:30 EDT
The unknown structure is probably ATSR and entirely harmless -- it's not going to be related to the problem. Running the VT-d test suite would be useful.
Comment 25 David Woodhouse 2009-05-15 05:15:24 EDT
(In reply to comment #22)
> Try blacklisting the wireless devices.
> I've seen problems w/my wireless device (that other wired-nic's don't);
> applying various upstream (wireless driver) patches stopped the kernel
> crashes I saw on my Dell laptop, but it would leave it hanging
> when the wireless went to do dhcp (but phase that connects
> to wireless router worked).  

Can you file a separate bug for that please? And if it happens only with the IOMMU enabled, please CC me.
Comment 26 drago01 2009-06-02 15:00:16 EDT
Forgot to add this information here:

While trying to debug the udev hang, it turned out that the hard disk DMA goes AWOL, which results in a lot of input/output errors.
Comment 27 David Woodhouse 2009-06-27 09:12:23 EDT
(In reply to comment #26)
> Forgot to add this information here:
> 
> While trying to debug the udev hang it turned out that the harddisk dma goes
> AWOL which results into a lot of input/output errors.  


That's kind of what I suspected -- but _when_ does it happen? It doesn't happen when we first enable the IOMMU; the disk is still working after that.

Can you try the 2.6.31-rc1 kernel with 'iommu=pt' ?
Comment 28 David Woodhouse 2009-07-29 12:27:39 EDT
I have one of these now, and have reproduced the problem. It works fine with the IOMMU on (as long as Legacy USB support is disabled in the BIOS), and the hang "in udev" is caused by loading the snd-hda-intel driver. Blacklisting that makes it work fine. Will investigate further...
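
When checking whether a blacklist took effect, note that the kernel lists module names with underscores (`snd_hda_intel`) even when they are written with dashes, as here. A minimal sketch, assuming the usual `/proc/modules` layout in which the module name is the first field of each line; the helper name `is_module_loaded` and the sample text are made up for illustration:

```python
def normalize(name):
    """Module names are reported with underscores, whichever form the user typed."""
    return name.replace("-", "_")


def is_module_loaded(name, proc_modules_text):
    """Return True if `name` appears as the first field of any /proc/modules line."""
    wanted = normalize(name)
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


# Hypothetical excerpt in /proc/modules format, not taken from the reporter's machine.
sample = (
    "snd_hda_intel 40000 2 - Live 0x0000000000000000\n"
    "snd_hda_codec 100000 1 snd_hda_intel, Live 0x0000000000000000\n"
)
print(is_module_loaded("snd-hda-intel", sample))  # prints: True
print(is_module_loaded("snd-pcm", sample))        # prints: False
```

On a live system the text would come from reading `/proc/modules`. The thread does not say how the driver was blacklisted; one common way is a `blacklist snd_hda_intel` line in a file under `/etc/modprobe.d/`.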
Comment 29 David Woodhouse 2009-07-29 14:39:52 EDT
Pressing the reset button after the crash doesn't seem to work -- I have to power cycle the box. The BIOS claims to have a separate IOMMU just for the sound device, and I think it's telling the truth -- if I hack the code to let the sound fall under the catch-all device, then sound DMA fails with a 'Present bit in root entry is clear' fault as we would expect.

So the problem seems to happen when we set up and use a DMA mapping on the second IOMMU unit. I'll see if I can get help from some chipset folks.
Comment 30 David Woodhouse 2009-07-29 14:47:14 EDT
Rainer, is your problem also avoided by blacklisting the snd-hda-intel driver?
Comment 31 David Woodhouse 2009-07-29 15:27:35 EDT
If I boot with 'iommu=pt', then it works fine (I didn't try actual sound output, but it probes the codec OK and the system doesn't die).

If I hack the code to not use hardware pass-through, but instead use the software 1:1 mapping, it fails. I'm beginning to suspect a hardware problem...
Comment 32 drago01 2009-07-29 15:33:40 EDT
(In reply to comment #31)
> If I boot with 'iommu=pt', then it works fine (I didn't try actual sound
> output, but it probes the codec OK and the system doesn't die).
> 
> If I hack the code to not use hardware pass-through, but instead use the
> software 1:1 mapping, it fails. I'm beginning to suspect a hardware problem...  

What kind of hardware problem exactly?
Comment 33 David Woodhouse 2009-07-29 16:02:35 EDT
I don't know. But so far I've observed that: 

 - Hardware pass-through on the sound IOMMU works fine.
 - Software pass-through (where we set up a 1:1 mapping in the page tables for
   all of memory as reported by E820) fails as you described, which is that
   _ALL_ DMA in the system, even that on the other IOMMU, stops dead.

There's very little scope for a software bug to be causing this -- or even a BIOS bug AFAICT. Unless the BIOS is describing the hardware wrongly in some way such that it _almost_ works...
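
To make the distinction concrete: in software pass-through the page tables still exist, but every I/O virtual address is mapped to the identical physical address. The following is only a conceptual sketch in Python, not the kernel's code. The E820 entries, the 4 KiB page size, the choice to map only "usable" entries, and the name `identity_map` are all assumptions for illustration:

```python
PAGE_SIZE = 4096  # assumed page size


def identity_map(e820_ranges):
    """Return {iova_page: phys_page} covering every usable range 1:1."""
    mapping = {}
    for start, length, kind in e820_ranges:
        if kind != "usable":
            continue
        first = start // PAGE_SIZE                            # round start down
        last = (start + length + PAGE_SIZE - 1) // PAGE_SIZE  # round end up
        for pfn in range(first, last):
            mapping[pfn] = pfn  # identity: IOVA page == physical page
    return mapping


# Hypothetical, tiny E820 map: (start, length, type)
e820 = [
    (0x0000, 0x3000, "usable"),
    (0x3000, 0x1000, "reserved"),
    (0x8000, 0x1800, "usable"),
]
table = identity_map(e820)
print(sorted(table))  # prints: [0, 1, 2, 8, 9]
```

Hardware pass-through, by contrast, skips the table walk entirely, which is why the two modes can behave differently on the same board even though they should produce the same addresses.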
Comment 34 drago01 2009-07-29 16:08:26 EDT
(In reply to comment #33)
> I don't know. But so far I've observed that: 
> 
>  - Hardware pass-through on the sound IOMMU works fine.
>  - Software pass-through (where we set up a 1:1 mapping in the page tables for
>    all of memory as reported by E820) fails as you described, which is that
>    _ALL_ DMA in the system, even that on the other IOMMU, stops dead.
> 
> There's very little scope for a software bug to be causing this -- or even a
> BIOS bug AFAICT. Unless the BIOS is describing the hardware wrongly in some way
> such that it _almost_ works...  

OK, is it possible for you (with your Intel hat on) to get in contact with ASUS and get any information from them about this?

My contact attempts went straight to /dev/null.
Comment 35 David Woodhouse 2009-07-29 16:43:25 EDT
I think we can probably manage that. I'm talking to the chipset folks right now, and if we decide it's a board layout or BIOS issue then I think they have some way of making contact.
Comment 36 David Woodhouse 2009-07-31 12:22:13 EDT
I tried hacking the code to treat sound devices like we do graphics devices when booted with 'intel_iommu=igfx_off' -- i.e. ignore the dedicated IOMMU unit completely, and do no mapping for the PCI device in question.

The board locks up at boot time, when enabling the _other_ IOMMU.
Comment 37 David Woodhouse 2009-08-04 12:10:49 EDT
Created attachment 356205 [details]
Quirk to use passthrough mode for sound

This patch, which applies on top of commit 19943b0e at git://git.infradead.org/iommu-2.6.git, should work around the problem by using hardware passthrough mode for the sound device.
Comment 38 drago01 2009-08-10 12:20:54 EDT
OK, I have built your tree with the patch applied on top of it, and it seems to work fine (with USB legacy off).

System booted and I am currently writing this comment while running it with IOMMU on ;)

Now we need some kind of workaround for the USB issue (if it is possible to do anything about it in the kernel at all).
Comment 39 David Woodhouse 2009-08-14 09:06:08 EDT
The USB issue is worked around by the patch at https://lists.linux-foundation.org/pipermail/iommu/2009-August/001657.html
Comment 40 drago01 2009-08-14 09:23:26 EDT
(In reply to comment #39)
> The USB issue is worked around by the patch at
> https://lists.linux-foundation.org/pipermail/iommu/2009-August/001657.html  

Nice, thanks for that. This means that your current iommu tree should "just work" on this box, right? (As Greg has acked the patch, I assume it is already in your tree.)

Anyway going to test this later today or tomorrow.
Comment 41 Michael J Coss 2009-08-20 07:59:17 EDT
I have a similar board, an ASUS P6T7 WS, and I'm seeing pretty much the same problems. I'm trying to enable VT-d support and, as above, I must disable legacy USB to get the system to boot, after which the system hangs waiting for uevents. I tried iommu=pt, which doesn't help, but disabling the sound device in the BIOS gets the system booting, so clearly some interaction with the Intel snd_hda device is triggering the problem. I'm running a 2.6.30-r5 kernel on Gentoo. Thanks for the work so far, as you've helped me get further along. Now if I can just integrate the patches, maybe I can get to the point of actually testing VT-d on this board.
Comment 42 Michael J Coss 2009-08-21 05:56:34 EDT
Just some additional data. I've fetched and built the 2.6.31-rc6 kernel, rebuilt the Nvidia driver, and booted with VT-d enabled on the board. As before, I disabled legacy USB and the Intel snd-hda device in the BIOS. I get the following errors, which could be either a bug in the Nvidia driver or an issue with DMAR:

DRHD: handling fault status reg 202
DMAR:[DMA READ] Request device [07:00.0] fault addr 0
DMAR[fault reason 06] PTE Read access not set

< repeats with different status registers, over and over again, as long as Xserver is active >

Now the graphics card is still working, and 3D acceleration is working as well (i.e. glxgears runs with an appropriate frame rate), albeit slowed by the printks, but obviously something's wrong.

If I can help test any patches or updates just let me know.
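
For anyone collecting these messages from a long log, a rough sketch of a parser follows. The patterns are written against the lines exactly as quoted above (which may have been retyped and so may differ slightly from the kernel's real output); the function name `parse_faults` and the dictionary keys are illustrative:

```python
import re

# "DMAR:[DMA READ] Request device [07:00.0] fault addr 0"
REQUEST_RE = re.compile(
    r"DMAR:?\s*\[DMA (?P<op>READ|WRITE)\] Request device "
    r"\[(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] fault addr (?P<addr>[0-9a-f]+)",
    re.IGNORECASE,
)
# "DMAR[fault reason 06] PTE Read access not set"
REASON_RE = re.compile(
    r"DMAR:?\s*\[fault reason (?P<code>\d+)\]\s*(?P<text>.+)", re.IGNORECASE
)


def parse_faults(log_text):
    """Pair each 'Request device' line with the 'fault reason' line that follows it."""
    faults = []
    current = None
    for line in log_text.splitlines():
        m = REQUEST_RE.search(line)
        if m:
            current = {
                "op": m.group("op").upper(),
                "device": m.group("device").lower(),
                "addr": int(m.group("addr"), 16),
                "reason_code": None,
                "reason": None,
            }
            faults.append(current)
            continue
        m = REASON_RE.search(line)
        if m and current is not None:
            current["reason_code"] = int(m.group("code"))
            current["reason"] = m.group("text").strip()
            current = None
    return faults


log = """DRHD: handling fault status reg 202
DMAR:[DMA READ] Request device [07:00.0] fault addr 0
DMAR[fault reason 06] PTE Read access not set
"""
for fault in parse_faults(log):
    print(fault["device"], fault["op"], fault["reason_code"], fault["reason"])
# prints: 07:00.0 READ 6 PTE Read access not set
```

Grouping the results by `device` would show whether all of the faults come from the same PCI device.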
Comment 43 Mark McLoughlin 2009-09-22 09:28:18 EDT
Hmm, it's not clear to me what the current status is with this one - e.g. it's filed against Fedora 10, but people are talking about 2.6.30 and 2.6.31. Anyone care to summarise?
Comment 44 Adel Gadllah 2009-09-22 10:19:10 EDT
(In reply to comment #43)
> Hmm, it's not clear to me what the current status is with this one - e.g. it's
> filed against Fedora 10, but people are talking about 2.6.30 and 2.6.31. Anyone
> care to summarise?  

It initially happened on F10, but fixing it was a longer process. The fixes/workarounds should be in David's IOMMU tree now; I don't know if all of them made it into .31.

David?
Comment 45 David Woodhouse 2009-09-22 17:25:08 EDT
The Asus isoch bug (all DMA goes south when you load the sound driver) has been root-caused in the lab this week, and I'm putting a patch together now.

There are some other patches which went into .31, some more about to go into .32.
Comment 46 David Woodhouse 2009-09-22 17:38:59 EDT
Oh, and the root cause is a BIOS bug. Again.

Yay for closed-source BIOSes.

The workaround for the other BIOS bug which killed this Asus board is queued for 2.6.32, but I've already put it into the rawhide kernel:
https://lists.linux-foundation.org/pipermail/iommu/2009-August/001657.html
Comment 47 David Woodhouse 2009-09-30 12:21:27 EDT
OK, sorry for the delay. This should fix it by bypassing the affected IOMMU unit, just for the sound device:
http://git.infradead.org/iommu-2.6.git/commitdiff/e0fc7e0b4

We've _also_ poked Asus and they have a new BIOS available which fixes the problem.
Comment 48 Adam Williamson 2009-11-07 14:32:28 EST
*** Bug 522668 has been marked as a duplicate of this bug. ***
Comment 49 Adam Williamson 2009-11-07 14:33:03 EST
Rahul nominated one of the dupes as f12blocker, so nominating this instead. See all the dupes on 522668: we have a lot of people running into this :(

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 50 Adam Williamson 2009-11-07 14:47:00 EST
so, here's the scoop on this.

it breaks USB functionality entirely, and in one case at least completely stops the system booting, on several Intel chipset-based motherboards. (It's actually really caused by buggy BIOSes, but we can't sell that to the users). We have at least five reports so far, there's probably more I didn't catch and dupe yet.

If we ship with this, we will have people with unusable systems. The workaround is simple: intel_iommu=off kernel parameter. But it relies on them finding the documentation.

If we disable it by default, the impact is that it breaks PCI passthrough for KVMs. Kyle is almost positive it can't possibly break anything else. The converse workaround would be possible for any virt users who need that to work: intel_iommu=1 .

This is fairly sucky, and late to catch. Honestly it's something I'd like to fix. The above is a summary of what we know so far, I'll do some more research later today (inc. looking for more dupes and check what other distros have done), and possibly poke Jesse's cell.

Comment 51 Adam Williamson 2009-11-07 16:01:17 EST
also note #530340: it's not strictly a dupe of this issue, but it would be addressed by the fix Kyle would put in place for this issue if we decide to fix it. So that's another report to consider.

Comment 52 Adam Williamson 2009-11-07 16:09:50 EST
also see 522201 for another system known to throw its toys out of the pram with intel_iommu=1 - smooge, can you check if that system fails with f12's default kernel config?

Comment 53 Adam Williamson 2009-11-07 16:13:20 EST
also 495603 - from an earlier kernel when the reporter caught the "DMAR reported at address zero!" message when it was just a warning, I bet that one would fail with f12 kernel. Gustavo, if you read this soon, could you check?

Comment 54 Mariusz Smykuła 2009-11-07 16:15:05 EST
What is the difference between intel_iommu=off and iommu=soft?
iommu=soft works great for me, but when I used intel_iommu=off, after some time my Wi-Fi was dead and the router needed a restart (maybe this is a router problem, not IOMMU; I'm not sure). In messages I found:

Nov  7 22:00:19 localhost NetworkManager: <info>  Activation (wlan0) Stage 5 of 5 (IP Configure Commit) complete.
Nov  7 22:00:19 localhost ntpd[1557]: Listening on interface #5 wlan0, 192.168.1.100#123 Enabled
Nov  7 22:01:44 localhost abrtd: Hmm, stray update_client: 'Tworzenie raportów awarii "kernel oops"'
Nov  7 22:01:44 localhost abrt: Kerneloops: Reported 1 kernel oopses to Abrt
Nov  7 22:01:44 localhost abrtd: Directory 'kerneloops-1257627704-1' creation detected
Nov  7 22:01:44 localhost abrtd: Uzyskiwanie lokalnego uniwersalnego, unikalnego identyfikatora
Nov  7 22:01:44 localhost abrtd: New crash, saving...
Nov  7 22:04:26 localhost NetworkManager: <info>  (wlan0): device state change: 8 -> 3 (reason 39)
Nov  7 22:04:26 localhost NetworkManager: <info>  (wlan0): deactivating device (reason: 39).

With the default kernel options, USB is broken but Wi-Fi and abrtd are OK.
Comment 55 Adam Williamson 2009-11-07 16:27:39 EST
524808 also is related to this topic and contains several people reporting the common 'USB doesn't work' manifestation, don't know if kernel devs would consider it a dupe or not.

Comment 56 Adam Williamson 2009-11-07 16:28:34 EST
mariusz: I believe intel_iommu=off disables it entirely but iommu=soft falls back to a software implementation. Chuck says that iommu=soft is the preferred workaround, which matches your case.
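
To see which of these options a machine was actually booted with, the kernel command line can be inspected. A minimal sketch, assuming simple whitespace-separated `key=value` tokens (quoted values containing spaces are not handled); the function names are illustrative and the sample command lines are invented:

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {option: value}; bare flags map to None."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def iommu_settings(cmdline):
    """Return only the two IOMMU-related options discussed in this bug."""
    options = parse_cmdline(cmdline)
    return {key: options.get(key) for key in ("intel_iommu", "iommu")}


print(iommu_settings("ro root=/dev/sda1 intel_iommu=on iommu=pt quiet"))
# prints: {'intel_iommu': 'on', 'iommu': 'pt'}
print(iommu_settings("ro root=/dev/sda1 iommu=soft rhgb"))
# prints: {'intel_iommu': None, 'iommu': 'soft'}
```

On a running system the input string would be the contents of `/proc/cmdline`.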

Comment 57 Adam Williamson 2009-11-07 18:43:51 EST
David Woodhouse says the correct master bug for the 'my USB no worky' issues is in fact 524808. Will shift the dupe notifications and the F12Blocker status to that bug. Please continue discussion there. Sorry for the mix-up.

He says this bug should be closed, but I'll leave that to him.

Comment 58 David Woodhouse 2009-11-07 18:59:37 EST
This bug, which causes a machine to crash when the sound driver is loaded, is fixed.
