Bug 732279
Summary: | mlx4_core 0000:86:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x00000000fe181000] [size=4096 bytes] | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Albert Strasheim <fullung> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Priority: | unspecified |
Version: | 16 | CC: | gansalmon, itamar, jforbes, jonathan, kernel-maint, madhu.chinakonda |
Hardware: | x86_64 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2012-11-14 15:23:48 UTC |
Description
Albert Strasheim
2011-08-21 15:19:14 UTC
I think the "vpd r/w failed" errors are due to a BIOS bug which is exposed by udev loading modules together. I don't think it makes a difference to whether this warning is emitted. It seems this has happened before: http://copilotco.com/mail-archives/ofa.2009/msg04432.html

This still happens with the latest 3.2.3 debug kernel rpm.

WARNING: at lib/dma-debug.c:966 check_sync+0x2a8/0x530()
mlx4_core 0000:02:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x00000017839c1000] [size=4096 bytes]

[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.

looks fixed. By the way, apparently pcie_aspm=off or pcie_aspm=performance fixes the VPD issue, according to Supermicro.

Saw this again on a machine with a mlx4 with older firmware with 3.3.0-4 debug.

False alarm maybe. I see now it was 3.2.7-debug on this machine, not 3.3.0-4. Will retest.

Retested with 3.3.0-4.fc16.x86_64.debug on this machine. Looks fixed.

Argh. I wasn't looking properly. Finally, 3.3.0-4.fc16.x86_64.debug still has this.

[ 40.693654] ------------[ cut here ]------------
[ 40.695188] WARNING: at lib/dma-debug.c:966 check_sync+0x2a8/0x530()
[ 40.697642] Hardware name: SUN BLADE X6270 SERVER MODULE
[ 40.706231] mlx4_core 0000:0d:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000102c981000] [size=4096 bytes]
[ 40.730140] Modules linked in: ib_ipoib ib_cm ib_addr ib_sa ib_uverbs ib_umad ib_mad ib_core ipmi_poweroff ipmi_watchdog ipmi_devintf i2c_i801 mptsas(+) iTCO_wdt mptscsih igb i2c_core joydev microcode iTCO_vendor_support i7core_edac mlx4_core(+) mptbase ioatdma dca edac_core scsi_transport_sas ipmi_si ipmi_msghandler
[ 40.787764] Pid: 257, comm: modprobe Not tainted 3.3.0-4.fc16.x86_64.debug #1
[ 40.790549] Call Trace:
[ 40.806210] [<ffffffff81061f6f>] warn_slowpath_common+0x7f/0xc0
[ 40.808336] [<ffffffff81062066>] warn_slowpath_fmt+0x46/0x50
[ 40.827621] [<ffffffff8133ff88>] check_sync+0x2a8/0x530
[ 40.829762] [<ffffffff81340492>] debug_dma_sync_single_for_cpu+0x42/0x50
[ 40.847063] [<ffffffff8133c0fc>] ? is_swiotlb_buffer+0x3c/0x50
[ 40.849194] [<ffffffff8133c918>] ? swiotlb_sync_single+0x38/0x80
[ 40.868855] [<ffffffff8133ca5c>] ? swiotlb_sync_single_for_cpu+0xc/0x10
[ 40.905296] [<ffffffffa0095d3a>] __mlx4_write_mtt+0xea/0x1e0 [mlx4_core]
[ 40.909217] [<ffffffffa0095f5c>] mlx4_write_mtt+0x12c/0x170 [mlx4_core]
[ 40.925092] [<ffffffffa008a9bd>] mlx4_create_eq+0x4ad/0x6e0 [mlx4_core]
[ 40.926982] [<ffffffffa008b24f>] mlx4_init_eq_table+0x1ff/0x6b0 [mlx4_core]
[ 40.947112] [<ffffffffa00916d7>] mlx4_setup_hca+0x167/0x530 [mlx4_core]
[ 40.964978] [<ffffffff811a26ec>] ? kfree+0x28c/0x2a0
[ 40.966813] [<ffffffffa0092342>] __mlx4_init_one+0x8a2/0xca0 [mlx4_core]
[ 40.985583] [<ffffffffa009f5df>] mlx4_init_one+0x3d/0x42 [mlx4_core]
[ 40.987441] [<ffffffff813498dc>] local_pci_probe+0x5c/0xd0
[ 41.005550] [<ffffffff8134b1d9>] pci_device_probe+0x109/0x130
[ 41.008078] [<ffffffff8141216c>] driver_probe_device+0x9c/0x300
[ 41.025620] [<ffffffff8141247b>] __driver_attach+0xab/0xb0
[ 41.027444] [<ffffffff814123d0>] ? driver_probe_device+0x300/0x300
[ 41.046544] [<ffffffff814104fe>] bus_for_each_dev+0x5e/0x90
[ 41.048703] [<ffffffff81411d6e>] driver_attach+0x1e/0x20
[ 41.065678] [<ffffffff81411960>] bus_add_driver+0x1c0/0x2b0
[ 41.067490] [<ffffffffa00b005b>] ? mlx4_catas_init+0x5b/0x5b [mlx4_core]
[ 41.086974] [<ffffffff814129f6>] driver_register+0x76/0x140
[ 41.104564] [<ffffffff813325a8>] ? __raw_spin_lock_init+0x38/0x70
[ 41.106723] [<ffffffffa00b005b>] ? mlx4_catas_init+0x5b/0x5b [mlx4_core]
[ 41.125441] [<ffffffff8134ae96>] __pci_register_driver+0x66/0xe0
[ 41.127858] [<ffffffffa00b005b>] ? mlx4_catas_init+0x5b/0x5b [mlx4_core]
[ 41.145308] [<ffffffffa00b0107>] mlx4_init+0xac/0xfa5 [mlx4_core]
[ 41.147163] [<ffffffff8100203f>] do_one_initcall+0x3f/0x170
[ 41.166884] [<ffffffff810dbb82>] sys_init_module+0xc82/0x21f0
[ 41.169044] [<ffffffff816abb29>] system_call_fastpath+0x16/0x1b
[ 41.186223] ---[ end trace 2a085dfdc60385a8 ]---

Are you still seeing this on a 3.4 or 3.5 debug kernel? I'm guessing probably, but it would be good to know.

I'll try to boot a machine with the latest debug kernel over the weekend.

# Mass update to all open bugs.

Kernel 3.6.2-1.fc16 has just been pushed to updates. This update is a significant rebase from the previous version. Please retest with this kernel, and let us know if your problem has been fixed.

In the event that you have upgraded to a newer release and the bug you reported is still present, please change the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered.

If you are not the original bug reporter and you still experience this bug, please file a new report, as it is possible that you may be seeing a different problem. (Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).

With no response, we are closing this bug under the assumption that it is no longer an issue. If you still experience this bug, please feel free to reopen the bug report.
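
For anyone retesting on a newer kernel and wanting to compare traces across machines, here is a minimal sketch of how the DMA-API warning and its call trace could be pulled out of dmesg output. It is not part of the original report. The function names (`parse_dma_warning`, `first_driver_frame`) are made up for illustration, and the regular expressions only assume the log layout seen in the trace in this bug; other kernel versions may format these lines differently.

```python
import re

# Matches the dma-debug warning line as it appears in this report, e.g.
# "mlx4_core 0000:0d:00.0: DMA-API: ... [device address=0x...] [size=4096 bytes]".
# Assumption: driver name, then PCI address, then "DMA-API:".
DMA_WARNING = re.compile(
    r"(?P<driver>\S+) (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"DMA-API: (?P<message>.*?) "
    r"\[device address=(?P<addr>0x[0-9a-f]+)\] \[size=(?P<size>\d+) bytes\]"
)

# Matches one call-trace frame such as
# "[<ffffffffa0095d3a>] __mlx4_write_mtt+0xea/0x1e0 [mlx4_core]".
# A "? " before the symbol marks a frame the unwinder was not sure about.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_dma_warning(log_text):
    """Return details of the first DMA-API warning in log_text, or None."""
    match = DMA_WARNING.search(log_text)
    if match is None:
        return None
    # Keep only frames without the "?" marker, as (symbol, module) pairs.
    frames = [
        (frame.group("symbol"), frame.group("module"))
        for frame in FRAME.finditer(log_text)
        if not frame.group("unreliable")
    ]
    return {
        "driver": match.group("driver"),
        "device": match.group("bdf"),
        "message": match.group("message"),
        "address": int(match.group("addr"), 16),
        "size": int(match.group("size")),
        "frames": frames,
    }


def first_driver_frame(report):
    """Return the innermost frame that belongs to the reporting driver's module."""
    for symbol, module in report["frames"]:
        if module == report["driver"]:
            return symbol
    return None
```

Feeding the saved dmesg text to `parse_dma_warning` and then calling `first_driver_frame` on the result gives the driver function closest to the sync call, which is usually the quickest thing to compare between kernel versions.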