Bug 998732 - [abrt] kernel BUG at drivers/iommu/intel-iommu.c:785!
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:8d7b233cf713c12e06882104eae...
Depends On:
Blocks:
Reported: 2013-08-19 23:31 UTC by Julian Stecklina
Modified: 2013-10-18 19:31 UTC (History)

Fixed In Version: kernel-3.11.4-101.fc18
Clone Of:
Environment:
Last Closed: 2013-10-13 19:55:43 UTC
Type: ---
Embargoed:


Attachments
File: dmesg (77.54 KB, text/plain)
2013-08-19 23:32 UTC, Julian Stecklina
Fix pfn_to_dma_pte to return NULL instead of tripping over BUG_ON. (535 bytes, patch)
2013-08-20 13:22 UTC, Julian Stecklina

Description Julian Stecklina 2013-08-19 23:31:54 UTC
Description of problem:
The crash occurs in a userspace driver for an Intel 82599 (ixgbe) NIC that uses VFIO. It is triggered when the program maps a large memory region with the VFIO_IOMMU_MAP_DMA ioctl, and it is 100% reproducible. The parameters of the ioctl call were:

IOCTL VFIO_IOMMU_MAP_DMA flags 0x3 vaddr 7f74e8000000 iova 7f74e8000000 size 8000000

I'm happy to provide further info.

Additional info:
reporter:       libreport-2.1.6
kernel BUG at drivers/iommu/intel-iommu.c:785!
invalid opcode: 0000 [#1] SMP 
Modules linked in: vfio_pci vfio_iommu_type1 vfio nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mperf coretemp iTCO_wdt iTCO_vendor_support kvm_intel kvm crc32_pclmul crc32c_intel snd_hda_intel ghash_clmulni_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm microcode i2c_i801 snd_page_alloc ixgbe e1000e snd_timer lpc_ich mfd_core snd ptp ib_mthca mdio dca soundcore mei_me pps_core mei ib_mad ib_core nfsd auth_rpcgss nfs_acl lockd sunrpc uinput binfmt_misc xfs libcrc32c i915 i2c_algo_bit drm_kms_helper firewire_ohci drm firewire_core crc_itu_t i2c_core video
CPU: 4 PID: 1827 Comm: sv3 Not tainted 3.10.7-200.fc19.x86_64 #1
Hardware name:                  /DQ77MK, BIOS MKQ7710H.86A.0034.2012.0320.2026 03/20/2012
task: ffff8804053b53e0 ti: ffff8803e1882000 task.ti: ffff8803e1882000
RIP: 0010:[<ffffffff8150f558>]  [<ffffffff8150f558>] pfn_to_dma_pte+0x228/0x230
RSP: 0018:ffff8803e1883d10  EFLAGS: 00010202
RAX: 00000000000000fe RBX: 0000000000000003 RCX: 000000000000001b
RDX: ffff88033bc4d000 RSI: 00000007f1ae4000 RDI: ffff88033e7dad80
RBP: ffff8803e1883d50 R08: 0000000000000009 R09: 0000000000000000
R10: 0000000000003b71 R11: 0000000000008000 R12: 00007f1ae4000000
R13: 00000007f1ae4000 R14: ffff88033e7dad80 R15: 0000000000008000
FS:  00007f1afd1bf700(0000) GS:ffff88041e300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000003f47874be0 CR3: 00000003ede2d000 CR4: 00000000001427e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 0000000000000004 ffff880404e50dc0 0000000000000246 00007f1ae4000000
 00007f1ae4000000 ffff8803b618a960 0000000000000000 0000000000008000
 ffff8803e1883d60 ffffffff8150fa38 ffff8803e1883d70 ffffffff81505257
Call Trace:
 [<ffffffff8150fa38>] intel_iommu_iova_to_phys+0x18/0x40
 [<ffffffff81505257>] iommu_iova_to_phys+0x17/0x30
 [<ffffffffa048e4dc>] __vfio_dma_map+0x5c/0x270 [vfio_iommu_type1]
 [<ffffffffa048eebd>] vfio_iommu_type1_ioctl+0x53d/0x764 [vfio_iommu_type1]
 [<ffffffffa04844e7>] vfio_fops_unl_ioctl+0x77/0x340 [vfio]
 [<ffffffff811a96d5>] do_vfs_ioctl+0x305/0x520
 [<ffffffff81197ab0>] ? vfs_write+0x160/0x1e0
 [<ffffffff811a9971>] SyS_ioctl+0x81/0xa0
 [<ffffffff81647719>] system_call_fastpath+0x16/0x1b
Code: 44 8b 4d d0 e9 68 fe ff ff 48 83 c4 18 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 48 89 f0 48 d3 e8 48 85 c0 0f 84 1b fe ff ff <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 49 89 cb 49 89 f2 48 
RIP  [<ffffffff8150f558>] pfn_to_dma_pte+0x228/0x230
 RSP <ffff8803e1883d10>

Comment 1 Julian Stecklina 2013-08-19 23:32:01 UTC
Created attachment 788211 [details]
File: dmesg

Comment 2 Julian Stecklina 2013-08-20 09:38:10 UTC
This also happens for page-sized mappings. I think the failing call is the first one in my program that maps DMA memory at an address above 4G.

Comment 3 Julian Stecklina 2013-08-20 13:13:19 UTC
Ok, I checked the capabilities register of the IOMMU and the MGAW (maximum guest address width) seems to be 38. Thus VFIO asks the IOMMU about addresses the latter cannot handle, which is probably what the BUG_ON tests for. This is bad, because the user can control the addresses and force the kernel to stumble over a BUG_ON.

In intel_iommu_map() this case is handled explicitly.
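
To illustrate the reasoning in this comment, here is a minimal userspace sketch of the range check a program could run before calling VFIO_IOMMU_MAP_DMA. It assumes the VT-d capability register layout, where MGAW is stored in bits 21:16 as (width - 1). The helper names cap_to_mgaw and iova_range_fits are hypothetical and are not part of any kernel or VFIO API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode MGAW from a raw VT-d capability register value.
 * Assumption: MGAW sits in bits 21:16 and is encoded as (width - 1).
 */
static unsigned int cap_to_mgaw(uint64_t cap)
{
    return (unsigned int)((cap >> 16) & 0x3f) + 1;
}

/*
 * Return true if every address in [iova, iova + size) fits into
 * mgaw bits. Hypothetical helper for a userspace pre-check.
 */
static bool iova_range_fits(uint64_t iova, uint64_t size, unsigned int mgaw)
{
    if (size == 0)
        return false;

    uint64_t last = iova + size - 1;

    if (last < iova)            /* the range wraps around 2^64 */
        return false;
    if (mgaw >= 64)             /* every 64-bit address fits */
        return true;

    return (last >> mgaw) == 0; /* no bits set above the MGAW limit */
}
```

With an MGAW of 38, the mapping from the description (iova 7f74e8000000, size 8000000) fails this check, while a mapping just above 4G passes it. That suggests the address being above 4G is not the cause by itself; what matters is whether the address exceeds the width the IOMMU supports.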

Comment 4 Julian Stecklina 2013-08-20 13:22:31 UTC
Created attachment 788499 [details]
Fix pfn_to_dma_pte to return NULL instead of tripping over BUG_ON.

With the attached patch, a program using VFIO gets an EFAULT from the corresponding ioctl call, and intel_iommu_map logs a helpful message in the kernel log.
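
The behavioural change described here can be modelled in plain userspace C. This is only a sketch of the idea (return NULL for an out-of-range page frame number instead of asserting), not the attached patch or the kernel code. All names (model_domain, model_pfn_to_pte, model_iova_to_phys) are hypothetical, and the single identity_pte field stands in for a real page-table walk.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* 4 KiB pages */

/* Hypothetical stand-in for an IOMMU domain. */
struct model_domain {
    unsigned int addr_bits; /* supported address width in bits, >= 12 */
    uint64_t identity_pte;  /* fake entry, replaces a real table walk */
};

/*
 * Look up the entry for a page frame number. A pfn beyond the
 * domain's address width yields NULL rather than a fatal assertion.
 */
static uint64_t *model_pfn_to_pte(struct model_domain *d, uint64_t pfn)
{
    unsigned int pfn_bits = d->addr_bits - MODEL_PAGE_SHIFT;

    if (pfn_bits < 64 && (pfn >> pfn_bits) != 0)
        return NULL; /* address beyond what the domain can translate */

    d->identity_pte = pfn << MODEL_PAGE_SHIFT; /* fake identity mapping */
    return &d->identity_pte;
}

/* Caller side: a NULL entry is reported as "no translation" (0). */
static uint64_t model_iova_to_phys(struct model_domain *d, uint64_t iova)
{
    uint64_t *pte = model_pfn_to_pte(d, iova >> MODEL_PAGE_SHIFT);

    return pte ? *pte : 0;
}
```

The register dump in the description appears consistent with this kind of check: RSI holds the pfn 7f1ae4000, RCX holds 1b (27), and RAX holds fe, which is that pfn shifted right by 27. In the model, a domain with addr_bits set to 39 rejects the same address and returns 0 from model_iova_to_phys.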

Comment 5 Julian Stecklina 2013-08-21 10:56:34 UTC
I created an upstream bug: https://bugzilla.kernel.org/show_bug.cgi?id=60777

Comment 6 Josh Boyer 2013-08-21 15:49:22 UTC
(In reply to Julian Stecklina from comment #5)
> I created an upstream bug: https://bugzilla.kernel.org/show_bug.cgi?id=60777

Did you send this to the upstream maintainers?  Not everyone pays attention to kernel.org bugzilla, so it's likely best to email them the patch directly.

Comment 7 Julian Stecklina 2013-08-22 20:54:26 UTC
Will do.

Comment 8 Josh Boyer 2013-09-18 20:48:01 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.11.1-200.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 9 Josh Boyer 2013-10-08 18:21:19 UTC
(In reply to Julian Stecklina from comment #7)
> Will do.

Did you send this upstream then?  I don't see it in Linus' tree or in my lkml archives.

Comment 10 Julian Stecklina 2013-10-09 07:44:33 UTC
See http://lists.linuxfoundation.org/pipermail/iommu/2013-August/006411.html

I'll resubmit the patch.

Comment 11 Josh Boyer 2013-10-09 12:56:43 UTC
(In reply to Julian Stecklina from comment #10)
> See http://lists.linuxfoundation.org/pipermail/iommu/2013-August/006411.html
> 
> I'll resubmit the patch.

Great, thanks for the pointer.  I see that it has at least one upstream Ack, so I'll grab it for the next Fedora build as well.

Comment 12 Josh Boyer 2013-10-09 13:04:48 UTC
Added across the Fedora releases.  Thanks again.

Comment 13 Fedora Update System 2013-10-10 17:40:57 UTC
kernel-3.11.4-201.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.11.4-201.fc19

Comment 14 Fedora Update System 2013-10-10 17:41:23 UTC
kernel-3.11.4-101.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.11.4-101.fc18

Comment 15 Fedora Update System 2013-10-10 22:33:44 UTC
kernel-3.11.4-301.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.11.4-301.fc20

Comment 16 Fedora Update System 2013-10-11 02:32:41 UTC
Package kernel-3.11.4-201.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.11.4-201.fc19'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-18820/kernel-3.11.4-201.fc19
then log in and leave karma (feedback).

Comment 17 Fedora Update System 2013-10-13 19:55:43 UTC
kernel-3.11.4-301.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2013-10-14 07:10:52 UTC
kernel-3.11.4-201.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 19 Fedora Update System 2013-10-14 17:17:47 UTC
kernel-3.11.4-201.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 20 Fedora Update System 2013-10-18 19:31:33 UTC
kernel-3.11.4-101.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
