Bug 928024

Summary: forcedeth DMA-API: device driver failed to check map error
Product: [Fedora] Fedora
Reporter: John Reiser <jreiser>
Component: kernel
Assignee: Neil Horman <nhorman>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: gansalmon, itamar, jforbes, jonathan, kernel-maint, madhu.chinakonda, mattia.meneguzzo+fedora, nhorman
Target Milestone: ---
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-04-12 14:14:44 UTC
Type: Bug
Attachments:
Description                                                               Flags
[PATCH] forcedeth: Do a dma_mapping_error check after skb_frag_dma_map    none
[PATCH] forcedeth: Do a dma_mapping_error check after skb_frag_dma_map    none

Description John Reiser 2013-03-26 17:47:51 UTC
Description of problem: forcedeth triggers complaint from DMA-API checker.


Version-Release number of selected component (if applicable):
kernel-PAE-3.9.0-0.rc4.git0.1.fc19.i686 running on Athlon 64 (uniprocessor x86_64)

How reproducible:


Steps to Reproduce:
1. [perhaps] type ^S during "yum update" "Installing ..." from multi-user text console
Actual results: from syslog /var/log/messages:
kernel: [17539.340285] ------------[ cut here ]------------
kernel: [17539.341012] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x960()
kernel: [17539.341012] Hardware name: MS-7125
kernel: [17539.341012] forcedeth 0000:00:0a.0: DMA-API: device driver failed to check map error[device address=0x0000000013c88000] [size=544 bytes] [mapped as page]
kernel: [17539.341012] Modules linked in: fuse ebtable_nat ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack bnep bluetooth rfkill ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_hdmi snd_cmipci snd_mpu401_uart snd_hda_intel snd_intel8x0 snd_opl3_lib snd_ac97_codec gameport snd_hda_codec snd_rawmidi ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd k8temp soundcore serio_raw i2c_nforce2 forcedeth ata_generic pata_acpi nouveau video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core sata_sil pata_amd sata_nv uinput
kernel: [17539.341012] Pid: 17340, comm: sshd Not tainted 3.9.0-0.rc4.git0.1.fc19.i686.PAE #1
kernel: [17539.341012] Call Trace:
kernel: [17539.341012]  [<c045573c>] warn_slowpath_common+0x6c/0xa0
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c04557a3>] warn_slowpath_fmt+0x33/0x40
kernel: [17539.341012]  [<c0701953>] check_unmap+0x493/0x960
kernel: [17539.341012]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [17539.341012]  [<c0701e87>] debug_dma_unmap_page+0x67/0x70
kernel: [17539.341012]  [<f7eae8f2>] nv_unmap_txskb.isra.32+0x92/0x100 [forcedeth]
kernel: [17539.341012]  [<f7eaebb6>] nv_tx_done_optimized+0xe6/0x2a0 [forcedeth]
kernel: [17539.341012]  [<f7eb075c>] nv_napi_poll+0x5c/0x5b0 [forcedeth]
kernel: [17539.341012]  [<c092e1fd>] ? net_rx_action+0x7d/0x2e0
kernel: [17539.341012]  [<c04bd56e>] ? trace_hardirqs_on_caller+0x9e/0x170
kernel: [17539.341012]  [<c092e2b0>] net_rx_action+0x130/0x2e0
kernel: [17539.341012]  [<c045f369>] __do_softirq+0xc9/0x350
kernel: [17539.341012]  [<c045f2a0>] ? __hrtimer_tasklet_trampoline+0x20/0x20
kernel: [17539.341012]  <IRQ>  [<c096a651>] ? ip_finish_output+0x351/0x7c0
kernel: [17539.341012]  [<c045e8d4>] ? local_bh_enable+0xc4/0xe0
kernel: [17539.341012]  [<c096a651>] ? ip_finish_output+0x351/0x7c0
kernel: [17539.341012]  [<c096a3fd>] ? ip_finish_output+0xfd/0x7c0
kernel: [17539.341012]  [<c096b812>] ? ip_output+0x82/0x130
kernel: [17539.341012]  [<c096a300>] ? ip_fragment+0x950/0x950
kernel: [17539.341012]  [<c096ac06>] ? ip_local_out+0x26/0x90
kernel: [17539.341012]  [<c096b036>] ? ip_queue_xmit+0x186/0x600
kernel: [17539.341012]  [<c096aeb0>] ? ip_build_and_send_pkt+0x240/0x240
kernel: [17539.341012]  [<c098320b>] ? tcp_transmit_skb+0x3bb/0x950
kernel: [17539.341012]  [<c0979d8c>] ? tcp_rearm_rto.part.55+0x8c/0xf0
kernel: [17539.341012]  [<c0983915>] ? tcp_write_xmit+0x175/0xa50
kernel: [17539.341012]  [<c0984428>] ? __tcp_push_pending_frames+0x38/0xd0
kernel: [17539.341012]  [<c05569ae>] ? might_fault+0x9e/0xb0
kernel: [17539.341012]  [<c09753e3>] ? tcp_sendmsg+0x103/0xc50
kernel: [17539.341012]  [<c09a13bc>] ? inet_sendmsg+0xbc/0x1f0
kernel: [17539.341012]  [<c09a13fb>] ? inet_sendmsg+0xfb/0x1f0
kernel: [17539.341012]  [<c09a1300>] ? inet_release+0x1e0/0x1e0
kernel: [17539.341012]  [<c0914893>] ? sock_aio_write+0xe3/0x100
kernel: [17539.341012]  [<c067fab2>] ? avc_has_perm_flags+0x22/0x310
kernel: [17539.341012]  [<c058a8b7>] ? do_sync_write+0x97/0xd0
kernel: [17539.341012]  [<c058b065>] ? vfs_write+0x135/0x150
kernel: [17539.341012]  [<c058b141>] ? sys_write+0x41/0x80
kernel: [17539.341012]  [<c0a5b08d>] ? sysenter_do_call+0x12/0x38
kernel: [17539.341012] ---[ end trace 011ab8f9bc2f22ce ]---
kernel: [17539.341012] Mapped at:
kernel: [17539.341012]  [<c0700485>] debug_dma_map_page+0x75/0x150
kernel: [17539.341012]  [<f7eb1582>] nv_start_xmit_optimized+0x3b2/0x680 [forcedeth]
kernel: [17539.341012]  [<c092f18e>] dev_hard_start_xmit+0x21e/0x630
kernel: [17539.341012]  [<c094d13a>] sch_direct_xmit+0x9a/0x320
kernel: [17539.341012]  [<c092f7b6>] dev_queue_xmit+0x216/0x880
sh[389]: abrt-dump-oops: Found oopses: 1
sh[389]: abrt-dump-oops: Creating problem directories
abrtd: Directory 'oops-2013-03-26-10:20:58-17453-1' creation detected
abrt-dump-oops: Reported 1 kernel oopses to Abrt
abrtd: Core backtrace is generated and saved, 2455 bytes
abrtd: Looking for kernel package
abrtd: Looking for PAE kernel
abrtd: Kernel package kernel-PAE-3.9.0-0.rc4.git0.1.fc19.i686 found
abrtd: New problem directory /var/tmp/abrt/oops-2013-03-26-10:20:58-17453-1, processing
/etc/gdm/Xsession[1161]: abrt-applet: '/var/tmp/abrt/oops-2013-03-26-10:20:58-17453-1' is not writable
abrtd: New client connected
fprintd[17396]: ** Message: No devices in use, exit
/etc/gdm/Xsession[1161]: (abrt:1465): libnotify-WARNING **: Failed to connect to proxy
/etc/gdm/Xsession[1161]: abrt-applet: Failed to receive server caps
/etc/gdm/Xsession[1161]: abrt-applet: Can't show notification: Timeout was reached



Expected results: no complaint


Additional info: Auto-reporting of the problem via 'abrt' started, but there was no confirmation.  Searching for 'forcedeth' gave no duplicates, neither did 'forcedeth' as initial Summary.
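
The warning is raised by the kernel's DMA-API debugging code when a mapping is torn down, which is why the call trace runs through the driver's unmap path (nv_unmap_txskb) while the "Mapped at:" section points at the transmit path (nv_start_xmit_optimized). The following is a minimal userspace sketch of that bookkeeping, assuming a simplified model in which each mapping carries a "checked" flag that is set only when the driver asks whether the mapping failed. All mock_* names are hypothetical; they are not kernel functions and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace model of the checker; not kernel code. */
enum err_state { ERR_NOT_CHECKED, ERR_CHECKED };

struct mock_entry {
    uint64_t dev_addr;      /* address handed to the device */
    size_t size;            /* length of the mapped region */
    enum err_state state;   /* has the driver checked for a map error? */
    int live;               /* nonzero while the mapping exists */
};

#define MOCK_MAX 8
static struct mock_entry mock_table[MOCK_MAX];
static int mock_warnings;   /* complaints issued since the last reset */

struct mock_entry *mock_find(uint64_t dev_addr)
{
    for (int i = 0; i < MOCK_MAX; i++)
        if (mock_table[i].live && mock_table[i].dev_addr == dev_addr)
            return &mock_table[i];
    return NULL;
}

/* Record a mapping; every new entry starts out as "not checked". */
uint64_t mock_map(uint64_t dev_addr, size_t size)
{
    for (int i = 0; i < MOCK_MAX; i++) {
        if (!mock_table[i].live) {
            mock_table[i] = (struct mock_entry){ dev_addr, size,
                                                 ERR_NOT_CHECKED, 1 };
            return dev_addr;
        }
    }
    return 0; /* table full: modelled as a failed mapping */
}

/* Asking whether the mapping failed is what marks the entry as checked. */
int mock_mapping_error(uint64_t dev_addr)
{
    struct mock_entry *e = mock_find(dev_addr);

    if (e)
        e->state = ERR_CHECKED;
    return dev_addr == 0;
}

/* The complaint is raised here, at unmap time, not at map time. */
void mock_unmap(uint64_t dev_addr)
{
    struct mock_entry *e = mock_find(dev_addr);

    assert(e != NULL);
    if (e->state == ERR_NOT_CHECKED) {
        mock_warnings++;
        fprintf(stderr, "mock: failed to check map error [size=%zu bytes]\n",
                e->size);
    }
    e->live = 0;
}

/* Map twice, once without checking and once with; return the warning count. */
int mock_demo(void)
{
    mock_warnings = 0;

    uint64_t a = mock_map(0x13c88000u, 544);
    mock_unmap(a);                  /* never checked: one complaint */

    uint64_t b = mock_map(0x2edd0000u, 250);
    if (!mock_mapping_error(b))
        mock_unmap(b);              /* checked first: silent */

    return mock_warnings;
}
```

In this model mock_demo() produces exactly one complaint, for the mapping that was unmapped without ever being checked. The addresses and sizes are borrowed from the traces in this report purely as sample values.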

Comment 1 Neil Horman 2013-03-26 18:40:43 UTC
Created attachment 716670 [details]
[PATCH] forcedeth: Do a dma_mapping_error check after skb_frag_dma_map


This backtrace was recently reported on a 3.9 kernel:

Actual results: from syslog /var/log/messages:
kernel: [17539.340285] ------------[ cut here ]------------
kernel: [17539.341012] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x960()
kernel: [17539.341012] Hardware name: MS-7125
kernel: [17539.341012] forcedeth 0000:00:0a.0: DMA-API: device driver failed to
check map error[device address=0x0000000013c88000] [size=544 bytes] [mapped as
page]
kernel: [17539.341012] Modules linked in: fuse ebtable_nat ipt_MASQUERADE
nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_nat nf_nat_ipv6
ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat
nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
nf_conntrack bnep bluetooth rfkill ebtable_filter ebtables ip6table_filter
ip6_tables snd_hda_codec_hdmi snd_cmipci snd_mpu401_uart snd_hda_intel
snd_intel8x0 snd_opl3_lib snd_ac97_codec gameport snd_hda_codec snd_rawmidi
ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
k8temp soundcore serio_raw i2c_nforce2 forcedeth ata_generic pata_acpi nouveau
video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core sata_sil pata_amd
sata_nv uinput
kernel: [17539.341012] Pid: 17340, comm: sshd Not tainted
3.9.0-0.rc4.git0.1.fc19.i686.PAE #1
kernel: [17539.341012] Call Trace:
kernel: [17539.341012]  [<c045573c>] warn_slowpath_common+0x6c/0xa0
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c04557a3>] warn_slowpath_fmt+0x33/0x40
kernel: [17539.341012]  [<c0701953>] check_unmap+0x493/0x960
kernel: [17539.341012]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [17539.341012]  [<c0701e87>] debug_dma_unmap_page+0x67/0x70
kernel: [17539.341012]  [<f7eae8f2>] nv_unmap_txskb.isra.32+0x92/0x100

It's pretty plainly the result of an skb fragment getting unmapped without having
its initial mapping operation checked for errors.  This patch corrects that.

Signed-off-by: Neil Horman <nhorman>
CC: "David S. Miller" <davem>
---
 drivers/net/ethernet/nvidia/forcedeth.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
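
The attachment body is not reproduced in this report, so the following is not the patch itself. It is a userspace sketch of the general pattern the patch title describes: check each mapping for an error immediately after creating it, and on failure undo the mappings already made for the packet before dropping it. Every fake_* name below is a hypothetical stand-in introduced for illustration; the failure cookie and the injected failure index are likewise assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_MAP_FAILED ((uint64_t)0)   /* hypothetical failure cookie */

/* Hypothetical device state used to inject a mapping failure. */
struct fake_dev {
    int fail_at;    /* index of the map call that should fail, or -1 */
    int calls;      /* map calls made so far */
    int live;       /* mappings currently outstanding */
};

struct fake_buf {
    const void *data;
    size_t len;
};

/* Stand-in for a map call: fails on the configured call, else succeeds. */
uint64_t fake_map(struct fake_dev *d, const void *buf, size_t len)
{
    (void)len;
    if (d->calls++ == d->fail_at)
        return FAKE_MAP_FAILED;
    d->live++;
    return (uint64_t)(uintptr_t)buf;
}

/* Stand-in for the error check the driver is expected to perform. */
int fake_mapping_error(uint64_t addr)
{
    return addr == FAKE_MAP_FAILED;
}

void fake_unmap(struct fake_dev *d, uint64_t addr)
{
    (void)addr;
    assert(d->live > 0);
    d->live--;
}

/*
 * Map every buffer of one packet (a head plus fragments). Each mapping
 * is checked right after it is made. On failure, the buffers mapped so
 * far are unmapped in reverse order and -1 is returned so the caller
 * can drop the packet. Returns 0 when all buffers were mapped.
 */
int fake_xmit(struct fake_dev *d, const struct fake_buf *bufs,
              size_t nbufs, uint64_t *out)
{
    for (size_t i = 0; i < nbufs; i++) {
        out[i] = fake_map(d, bufs[i].data, bufs[i].len);
        if (fake_mapping_error(out[i])) {
            while (i-- > 0)
                fake_unmap(d, out[i]);
            return -1;
        }
    }
    return 0;
}

/* Fail the third mapping of a packet; return the mappings left behind. */
int fake_demo(void)
{
    static const char head[64], frag0[544], frag1[250];
    const struct fake_buf pkt[3] = {
        { head, sizeof head }, { frag0, sizeof frag0 }, { frag1, sizeof frag1 },
    };
    uint64_t addrs[3];
    struct fake_dev dev = { .fail_at = 2, .calls = 0, .live = 0 };

    if (fake_xmit(&dev, pkt, 3, addrs) != -1)
        return -1;      /* the injected failure was not reported */
    return dev.live;
}
```

In this sketch fake_demo() returns 0: the two mappings made before the injected failure are released, so nothing unchecked or leaked is left for a later unmap to trip over.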

Comment 2 Neil Horman 2013-03-26 18:45:34 UTC
The above (completely untested, but buildable) should fix the problem.  Please build a kernel with that and give it a try.  If it works well for you, I'll send it upstream.

Thanks!

Comment 3 John Reiser 2013-03-27 02:31:15 UTC
OK, kernel has been patched, built and booted:
Mar 26 19:12:48 f19r32 kernel: [    0.000000] Linux version 3.9.0-0.rc4.git0.1.local.fc19.i686.PAE (jreiser@f19r32) (gcc version 4.8.0 20130322 (Red Hat 4.8.0-1) (GCC) ) #1 SMP Tue Mar 26 14:32:28 PDT 2013

There is no sign of the DMA-API complaint yet.  The patched kernel is running this Firefox session.  I guess I'll look for the DMA-API complaint from time to time.


[Incidentally, during the rpmbuild 'patch' complained about mac80211-Dont-restart-sta-timer-if-not-running.patch (Patch21276, rhbz #920218) being malformed because of ending in the middle of a line.  I fixed it by removing e-mail signature lines, so that the patch ends with:
-----
+               if (ieee80211_sdata_running(sdata))
+                       ieee80211_restart_sta_timer(sdata);
        rcu_read_unlock();
 }

----- EOF
]

Comment 4 Neil Horman 2013-03-27 12:58:40 UTC
Thank you John.  Do you have a mean time to failure estimate on this bug?  I figure if it lets you survive longer than the amount of time it takes to normally hit the bug, we can call it fixed.

Comment 5 John Reiser 2013-03-27 15:32:10 UTC
The estimated MTBF is 1 to 2 days for my usage.  So if the patched kernel lasts through tomorrow's daily "yum update" (Thu. Mar. 28), then that might be good.

Comment 6 Neil Horman 2013-03-27 15:57:58 UTC
OK, please update tomorrow afternoon.  If it's good, I'll post it to netdev Friday and backport it once it's upstream.

Comment 7 John Reiser 2013-03-27 23:59:18 UTC
Here's another one (from abrt crash reporter!).  Is it from the same kernel location?

kernel: [31312.072993] ------------[ cut here ]------------
kernel: [31312.073007] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x960()
kernel: [31312.073010] Hardware name: MS-7125
kernel: [31312.073015] forcedeth 0000:00:0a.0: DMA-API: device driver failed to check map error[device address=0x000000002edd0000] [size=250 bytes] [mapped as page]
kernel: [31312.073017] Modules linked in: xfs btrfs zlib_deflate raid6_pq libcrc32c xor fuse ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack bnep bluetooth rfkill ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_hdmi snd_cmipci snd_mpu401_uart snd_hda_intel snd_opl3_lib snd_intel8x0 gameport snd_hda_codec snd_ac97_codec snd_rawmidi ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore k8temp serio_raw i2c_nforce2 forcedeth ata_generic pata_acpi nouveau video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core sata_nv sata_sil pata_amd uinput
kernel: [31312.073084] Pid: 19955, comm: reporter-urepor Not tainted 3.9.0-0.rc4.git0.1.local.fc19.i686.PAE #1
kernel: [31312.073084] Call Trace:
kernel: [31312.073084]  [<c045573c>] warn_slowpath_common+0x6c/0xa0
kernel: [31312.073084]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [31312.073084]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [31312.073084]  [<c04557a3>] warn_slowpath_fmt+0x33/0x40
kernel: [31312.073084]  [<c0701953>] check_unmap+0x493/0x960
kernel: [31312.073084]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [31312.073084]  [<c0701e87>] debug_dma_unmap_page+0x67/0x70
kernel: [31312.073084]  [<f7f378f2>] nv_unmap_txskb.isra.32+0x92/0x100 [forcedeth]
kernel: [31312.073084]  [<f7f37bb6>] nv_tx_done_optimized+0xe6/0x2a0 [forcedeth]
kernel: [31312.073084]  [<f7f39e1c>] nv_napi_poll+0x5c/0x5b0 [forcedeth]
kernel: [31312.073084]  [<c092e1fd>] ? net_rx_action+0x7d/0x2e0
kernel: [31312.073084]  [<c04bd56e>] ? trace_hardirqs_on_caller+0x9e/0x170
kernel: [31312.073084]  [<c092e2b0>] net_rx_action+0x130/0x2e0
kernel: [31312.073084]  [<c045f369>] __do_softirq+0xc9/0x350
kernel: [31312.073084]  [<c045f785>] irq_exit+0xa5/0xb0
kernel: [31312.073084]  [<c0418a05>] do_IRQ+0x45/0xb0
kernel: [31312.073084]  [<c0a5b5b8>] common_interrupt+0x38/0x40
kernel: [31312.073084]  [<c0441e33>] ? read_hpet+0x13/0x20
kernel: [31312.073084]  [<c04b140e>] ktime_get_ts+0x3e/0x100
kernel: [31312.073084]  [<c047a67f>] posix_ktime_get_ts+0xf/0x20
kernel: [31312.073084]  [<c047ba2f>] sys_clock_gettime+0x3f/0x80
kernel: [31312.073084]  [<c0a5b08d>] sysenter_do_call+0x12/0x38
kernel: [31312.073084] ---[ end trace 09874974c1e240c0 ]---
kernel: [31312.073084] Mapped at:
kernel: [31312.073084]  [<c0700485>] debug_dma_map_page+0x75/0x150
kernel: [31312.073084]  [<f7f3ddd2>] nv_start_xmit_optimized+0x3b2/0x680 [forcedeth]
kernel: [31312.073084]  [<c092f18e>] dev_hard_start_xmit+0x21e/0x630
kernel: [31312.073084]  [<c094d13a>] sch_direct_xmit+0x9a/0x320
kernel: [31312.073084]  [<c092f7b6>] dev_queue_xmit+0x216/0x880

Comment 8 John Reiser 2013-03-28 00:46:45 UTC
And another.  Of course this raises the question whether the patch is active.  The rpmbuild log says:
-----
+ ApplyPatch forcedeth-Do-a-dma_mapping_error-check-after-skb_frag_dma_map.patch
+ local patch=forcedeth-Do-a-dma_mapping_error-check-after-skb_frag_dma_map.patch
+ shift
+ '[' '!' -f /home/jreiser/rpmbuild/SOURCES/forcedeth-Do-a-dma_mapping_error-check-after-skb_frag_dma_map.patch ']'
Patch23020: forcedeth-Do-a-dma_mapping_error-check-after-skb_frag_dma_map.patch
+ case "$patch" in
+ patch -p1 -F1 -s
-----
so it looks like it made it.

kernel: [  923.646148] ------------[ cut here ]------------
kernel: [  923.647011] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x960()
kernel: [  923.647011] Hardware name: MS-7125
kernel: [  923.647011] forcedeth 0000:00:0a.0: DMA-API: device driver failed to check map error[device address=0x000000001dae0000] [size=544 bytes] [mapped as page]
kernel: [  923.647011] Modules linked in: ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack bnep bluetooth rfkill ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_hdmi serio_raw k8temp snd_hda_intel snd_hda_codec snd_cmipci snd_mpu401_uart snd_opl3_lib snd_hwdep gameport forcedeth snd_rawmidi snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore i2c_nforce2 ata_generic pata_acpi nouveau video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core sata_sil sata_nv pata_amd uinput
kernel: [  923.647011] Pid: 1449, comm: sshd Not tainted 3.9.0-0.rc4.git0.1.local.fc19.i686.PAE #1
kernel: [  923.647011] Call Trace:
kernel: [  923.647011]  [<c045573c>] warn_slowpath_common+0x6c/0xa0
kernel: [  923.647011]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [  923.647011]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [  923.647011]  [<c04557a3>] warn_slowpath_fmt+0x33/0x40
kernel: [  923.647011]  [<c0701953>] check_unmap+0x493/0x960
kernel: [  923.647011]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [  923.647011]  [<c041d188>] ? sched_clock+0x8/0x10
kernel: [  923.647011]  [<c0701e87>] debug_dma_unmap_page+0x67/0x70
kernel: [  923.647011]  [<f7fc68f2>] nv_unmap_txskb.isra.32+0x92/0x100 [forcedeth]
kernel: [  923.647011]  [<f7fc6bb6>] nv_tx_done_optimized+0xe6/0x2a0 [forcedeth]
kernel: [  923.647011]  [<f7fc8e1c>] nv_napi_poll+0x5c/0x5b0 [forcedeth]
kernel: [  923.647011]  [<c092e1fd>] ? net_rx_action+0x7d/0x2e0
kernel: [  923.647011]  [<c04bd56e>] ? trace_hardirqs_on_caller+0x9e/0x170
kernel: [  923.647011]  [<c092e2b0>] net_rx_action+0x130/0x2e0
kernel: [  923.647011]  [<c045f369>] __do_softirq+0xc9/0x350
kernel: [  923.647011]  [<c045f785>] irq_exit+0xa5/0xb0
kernel: [  923.647011]  [<c0418a05>] do_IRQ+0x45/0xb0
kernel: [  923.647011]  [<c04924c5>] ? local_clock+0x55/0x60
kernel: [  923.647011]  [<c0a5b5b8>] common_interrupt+0x38/0x40
kernel: [  923.647011]  [<c04bf57f>] ? lock_acquire+0x9f/0x1b0
kernel: [  923.647011]  [<c0556964>] ? might_fault+0x54/0xb0
kernel: [  923.647011]  [<c0556997>] might_fault+0x87/0xb0
kernel: [  923.647011]  [<c0556964>] ? might_fault+0x54/0xb0
kernel: [  923.647011]  [<c06ebc62>] _copy_from_user+0x32/0x60
kernel: [  923.647011]  [<c059cf57>] core_sys_select+0x177/0x450
kernel: [  923.647011]  [<c059ce07>] ? core_sys_select+0x27/0x450
kernel: [  923.647011]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [  923.647011]  [<c04ba5eb>] ? trace_hardirqs_off+0xb/0x10
kernel: [  923.647011]  [<c04924c5>] ? local_clock+0x55/0x60
kernel: [  923.647011]  [<c04bb157>] ? lock_release_holdtime.part.28+0x87/0xe0
kernel: [  923.647011]  [<c05ca8f2>] ? fsnotify+0x292/0x570
kernel: [  923.647011]  [<c05ca913>] ? fsnotify+0x2b3/0x570
kernel: [  923.647011]  [<c05ca6c6>] ? fsnotify+0x66/0x570
kernel: [  923.647011]  [<c058b00e>] ? vfs_write+0xde/0x150
kernel: [  923.647011]  [<c059d2a7>] sys_select+0x77/0xb0
kernel: [  923.647011]  [<c04bd5bc>] ? trace_hardirqs_on_caller+0xec/0x170
kernel: [  923.647011]  [<c0a5b08d>] sysenter_do_call+0x12/0x38
kernel: [  923.647011] ---[ end trace 7805049006c54510 ]---
kernel: [  923.647011] Mapped at:
kernel: [  923.647011]  [<c0700485>] debug_dma_map_page+0x75/0x150
kernel: [  923.647011]  [<f7fccdd2>] nv_start_xmit_optimized+0x3b2/0x680 [forcedeth]
kernel: [  923.647011]  [<c092f18e>] dev_hard_start_xmit+0x21e/0x630
kernel: [  923.647011]  [<c094d13a>] sch_direct_xmit+0x9a/0x320
kernel: [  923.647011]  [<c092f7b6>] dev_queue_xmit+0x216/0x880

Comment 9 Neil Horman 2013-03-28 14:17:03 UTC
It's active; I just missed a call site.  I'll have another patch for you shortly.  Sorry about that.

Comment 10 Neil Horman 2013-03-28 14:37:12 UTC
Created attachment 717682 [details]
[PATCH] forcedeth: Do a dma_mapping_error check after skb_frag_dma_map


This backtrace was recently reported on a 3.9 kernel:

Actual results: from syslog /var/log/messages:
kernel: [17539.340285] ------------[ cut here ]------------
kernel: [17539.341012] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x960()
kernel: [17539.341012] Hardware name: MS-7125
kernel: [17539.341012] forcedeth 0000:00:0a.0: DMA-API: device driver failed to
check map error[device address=0x0000000013c88000] [size=544 bytes] [mapped as
page]
kernel: [17539.341012] Modules linked in: fuse ebtable_nat ipt_MASQUERADE
nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_nat nf_nat_ipv6
ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat
nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
nf_conntrack bnep bluetooth rfkill ebtable_filter ebtables ip6table_filter
ip6_tables snd_hda_codec_hdmi snd_cmipci snd_mpu401_uart snd_hda_intel
snd_intel8x0 snd_opl3_lib snd_ac97_codec gameport snd_hda_codec snd_rawmidi
ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
k8temp soundcore serio_raw i2c_nforce2 forcedeth ata_generic pata_acpi nouveau
video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core sata_sil pata_amd
sata_nv uinput
kernel: [17539.341012] Pid: 17340, comm: sshd Not tainted
3.9.0-0.rc4.git0.1.fc19.i686.PAE #1
kernel: [17539.341012] Call Trace:
kernel: [17539.341012]  [<c045573c>] warn_slowpath_common+0x6c/0xa0
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c0701953>] ? check_unmap+0x493/0x960
kernel: [17539.341012]  [<c04557a3>] warn_slowpath_fmt+0x33/0x40
kernel: [17539.341012]  [<c0701953>] check_unmap+0x493/0x960
kernel: [17539.341012]  [<c049238f>] ? sched_clock_cpu+0xdf/0x150
kernel: [17539.341012]  [<c0701e87>] debug_dma_unmap_page+0x67/0x70
kernel: [17539.341012]  [<f7eae8f2>] nv_unmap_txskb.isra.32+0x92/0x100

It's pretty plainly the result of an skb fragment getting unmapped without having
its initial mapping operation checked for errors.  This patch corrects that.

Signed-off-by: Neil Horman <nhorman>
CC: "David S. Miller" <davem>
---
 drivers/net/ethernet/nvidia/forcedeth.c | 41 ++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

Comment 11 Neil Horman 2013-03-28 14:55:38 UTC
New patch, with the error checking added at the spot I missed.  Can you replace my previous patch with the one above and retest?  Thanks!

Comment 12 John Reiser 2013-03-29 16:08:49 UTC
A new kernel with the revised patch has survived two daily "yum update" runs without generating the complaint.  So the revised patch is ready for use by other people.

Comment 13 John Reiser 2013-03-29 20:44:50 UTC
Here's the same problem for e100 on real i686.

kernel: [  135.553283] ------------[ cut here ]------------
kernel: [  135.554028] WARNING: at lib/dma-debug.c:937 check_unmap+0x493/0x950()
kernel: [  135.554028] Hardware name: System Name
kernel: [  135.554028] e100 0000:02:0a.0: DMA-API: device driver failed to check map error[device address=0x0000000025b31a12] [size=90 bytes] [mapped as single]
kernel: [  135.554028] Modules linked in: ebtable_filter ebtables ip6table_filter ip6_tables iTCO_wdt iTCO_vendor_support gpio_ich microcode snd_cmipci snd_mpu401_uart snd_opl3_lib snd_hwdep gameport snd_rawmidi snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore i2c_i801 lpc_ich of_i2c uinput isofs squashfs radeon i2c_algo_bit drm_kms_helper ttm drm e100 mii i2c_core sunrpc
kernel: [  135.554028] Pid: 651, comm: nm-online Not tainted 3.9.0-0.rc4.git0.1.fc19.i686 #1
kernel: [  135.554028] Call Trace:
kernel: [  135.554028]  [<c0441aac>] warn_slowpath_common+0x6c/0xa0
kernel: [  135.554028]  [<c06ea623>] ? check_unmap+0x493/0x950
kernel: [  135.554028]  [<c06ea623>] ? check_unmap+0x493/0x950
kernel: [  135.554028]  [<c0441b13>] warn_slowpath_fmt+0x33/0x40
kernel: [  135.554028]  [<c06ea623>] check_unmap+0x493/0x950
kernel: [  135.554028]  [<c06eab47>] debug_dma_unmap_page+0x67/0x70
kernel: [  135.554028]  [<f7c42d78>] e100_tx_clean+0xc8/0x1c0 [e100]
kernel: [  135.554028]  [<c047e08f>] ? sched_clock_cpu+0xdf/0x150
kernel: [  135.554028]  [<f7c43672>] e100_poll+0x482/0x560 [e100]
kernel: [  135.554028]  [<c090df80>] net_rx_action+0x130/0x2e0
kernel: [  135.554028]  [<c044b6d9>] __do_softirq+0xc9/0x350
kernel: [  135.554028]  [<c047e08f>] ? sched_clock_cpu+0xdf/0x150
kernel: [  135.554028]  [<c044baf5>] irq_exit+0xa5/0xb0
kernel: [  135.554028]  [<c0404c65>] do_IRQ+0x45/0xb0
kernel: [  135.554028]  [<c04a92bc>] ? trace_hardirqs_on_caller+0xec/0x170
kernel: [  135.554028]  [<c0a39ff8>] common_interrupt+0x38/0x40
kernel: [  135.554028]  [<c04a007b>] ? print_tickdevice+0xcb/0x390
kernel: [  135.554028]  [<c0a31823>] ? _raw_spin_unlock_irqrestore+0x33/0x70
kernel: [  135.554028]  [<c0a28d9c>] __slab_free+0x5d/0x332
kernel: [  135.554028]  [<c08fcdf5>] ? skb_free_head+0x45/0x50
kernel: [  135.554028]  [<c04a934b>] ? trace_hardirqs_on+0xb/0x10
kernel: [  135.554028]  [<c04a9581>] ? debug_check_no_locks_freed+0xb1/0x150
kernel: [  135.554028]  [<c0561962>] kfree+0x262/0x290
kernel: [  135.554028]  [<c08fcdf5>] ? skb_free_head+0x45/0x50
kernel: [  135.554028]  [<c08fcdf5>] ? skb_free_head+0x45/0x50
kernel: [  135.554028]  [<c08fcdf5>] skb_free_head+0x45/0x50
kernel: [  135.554028]  [<c08fff34>] skb_release_data+0xb4/0xc0
kernel: [  135.554028]  [<c08fff57>] __kfree_skb+0x17/0x90
kernel: [  135.554028]  [<c090002b>] consume_skb+0x2b/0x120
kernel: [  135.554028]  [<c09bb0e3>] unix_stream_recvmsg+0x3b3/0x7c0
kernel: [  135.554028]  [<c066f602>] ? sock_has_perm+0x112/0x200
kernel: [  135.554028]  [<c08f4686>] sock_aio_read+0x106/0x140
kernel: [  135.554028]  [<c066a1b2>] ? avc_has_perm_flags+0x22/0x310
kernel: [  135.554028]  [<c0574c67>] do_sync_read+0x97/0xd0
kernel: [  135.554028]  [<c057538d>] vfs_read+0x12d/0x150
kernel: [  135.554028]  [<c0575541>] sys_read+0x41/0x80
kernel: [  135.554028]  [<c0a39acd>] sysenter_do_call+0x12/0x38
kernel: [  135.554028] ---[ end trace b0a5e47c7942c78b ]---
kernel: [  135.554028] Mapped at:
kernel: [  135.554028]  [<c06e91a3>] debug_dma_map_page+0x63/0x130
kernel: [  135.554028]  [<f7c42385>] e100_xmit_prepare+0xf5/0x160 [e100]
kernel: [  135.554028]  [<f7c40870>] e100_exec_cb+0x70/0x120 [e100]
kernel: [  135.554028]  [<f7c42423>] e100_xmit_frame+0x33/0x170 [e100]
kernel: [  135.554028]  [<c090ee5e>] dev_hard_start_xmit+0x21e/0x630

Comment 14 Neil Horman 2013-03-30 12:39:40 UTC
John, glad to hear that the forcedeth driver fix is good.  I'll square that away upstream and in Fedora on Monday.  As for the e100 issue, please submit a separate bug for that, as it will be a separate driver fix.  You're welcome to assign it directly to me if you like.  Thanks!

Comment 15 Neil Horman 2013-04-01 14:36:10 UTC
http://marc.info/?l=linux-netdev&m=136482676209692&w=2

Sent the patch upstream; we'll get it backported once it's accepted.

Comment 16 Justin M. Forbes 2013-04-11 19:53:30 UTC
Testing it now, I can reproduce the bug in a matter of seconds with trinity. If it works, I will get it into the next rawhide and f19 builds.

Comment 17 Neil Horman 2013-04-11 20:12:15 UTC
Justin, it's already in f19.

Comment 18 Justin M. Forbes 2013-04-12 14:14:44 UTC
Ahh, indeed it is.  Surprised to see it in F19 and not rawhide.  Putting this in rawhide right now and we will close this out.

Comment 19 Neil Horman 2013-04-12 14:55:08 UTC
I didn't bother with rawhide as this bz was for f19.  I figured it'll get pulled into rawhide in due course on the next upstream pull.

Comment 20 Neil Horman 2013-05-01 19:11:00 UTC
*** Bug 951809 has been marked as a duplicate of this bug. ***