Bug 490306
Summary: | Tracebacks at lib/dma-debug.c:479 | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Bruno Wolff III <bruno> | ||||||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||||
Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||
Severity: | medium | Docs Contact: | |||||||||
Priority: | low | ||||||||||
Version: | rawhide | CC: | kernel-maint, madko, quintela, sdodson | ||||||||
Target Milestone: | --- | ||||||||||
Target Release: | --- | ||||||||||
Hardware: | All | ||||||||||
OS: | Linux | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||
Doc Text: | Story Points: | --- | |||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2009-03-26 18:35:31 UTC | Type: | --- | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Attachments: |
|
Description
Bruno Wolff III
2009-03-15 03:22:42 UTC
Created attachment 335241 [details]
lspci output
Possibly related to this is a problem I am only seeing on the machine with the kernel tracebacks and not on my machine at home that has pretty much the same packages on it (but i586 instead of x86_64). Logging in under gnome or running desktop applications that bring of file lists (such as saving a downloaded file in firefox) under xfce breaks the system. Sometimes X is totally locked up, other times I can get to a VT, but login attempts fail between entering the login name and getting a password prompt. Also in that case there are some log messages about dm-0 being read only and problems with directory entries and inodes. Nothing gets left in the logs after forcably rebooting the system. I retested the gnome issue with the hope of getting some sysrq info. I didn't work like I thought, but by waiting for my time delayed sysrq commands to run eventually gnome died and I got access to a VT session I had had running. I couldn't save anything to any of the file systems. A lot of commands didn't work, dying with bus errors or segfaults. I noticed that none of the /dev/sd devices were listed any more. Though stuff from device mapper was. I tried putting in an encrypted usb stick and that resulted in /dev/sdc and /dev/sdc1 being created as normal, but I couldn't successfully run cryptsetup. There could also be a relation to a problem I started seeing a while back with smartctl. Using it on a device without forcing it to treat the drives as scsi instead of sas would knock out the device similar to what I saw with the gnome issue. However since I am using raid 1, I could still keep doing stuff. Though a reboot was needed to fix things. So while something changed relatively recently with how this problem gets triggered, I suspect the underlying issue has been there for a while. Also I saw that an issue with mptsas in 2.6.29 was mentioned on lmkl, though I think that was taken care of in rc3. The change in triggering may be related to Devicekit. Bug 489397 was closed with a fix that seems to be tied to dm devices and matches up with when I was having lockouts occur. I'll be testing this on Monday. If gnome session startup no longer locks up my machine the severity of the underlying issue for me will be greatly reduced. I got the same thing during installation but it's not happening post install. Smolt profile: http://www.smolts.org/client/show/pub_d0cee8ac-9106-4efc-8460-8b2d43a9b780 This seems very similar to Bug 487894, not sure if it's a dupe or not. ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:479 check_unmap+0x2b4/0x3dd() (Not tainted) Hardware name: MS-7125 sky2 0000:03:00.0: DMA-API: device driver frees DMA memory with wrong function [device address=0x00000 0007aca9000] [size=209 bytes] [mapped as page] [unmapped as single] Modules linked in: ata_generic nouveau drm i2c_core sky2 sata_sil firewire_ohci firewire_core crc_itu_ t forcedeth pata_acpi sata_nv pata_amd iscsi_ibft iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext2 ext4 jbd2 crc16 squashfs pcspkr edd floppy nfs lockd nfs_acl auth_rpcgss sunrpc vfat fat cramfs Pid: 0, comm: swapper Not tainted 2.6.29-0.258.rc8.git2.fc11.x86_64 #1 Call Trace: <IRQ> [<ffffffff8104bae3>] warn_slowpath+0xbc/0xf0 [<ffffffff8106d828>] ? graph_unlock+0x6b/0x77 [<ffffffff81070eb3>] ? __lock_acquire+0xb67/0xc0d [<ffffffff811a6a5d>] ? get_hash_bucket+0x28/0x34 [<ffffffff81396b58>] ? _spin_lock_irqsave+0x7d/0x8b [<ffffffff811a6a5d>] ? get_hash_bucket+0x28/0x34 [<ffffffff8106f9d6>] ? mark_held_locks+0x68/0x86 [<ffffffff811a7308>] check_unmap+0x2b4/0x3dd [<ffffffff810db081>] ? kmem_cache_free+0xd3/0xed [<ffffffff811a757e>] debug_dma_unmap_page+0x50/0x52 [<ffffffffa01ea919>] dma_unmap_single+0x6c/0x75 [sky2] [<ffffffffa01ea95c>] pci_unmap_page+0x3a/0x57 [sky2] [<ffffffffa01eb2c5>] sky2_tx_complete+0xb7/0x153 [sky2] [<ffffffffa01eb5a9>] sky2_tx_done+0x2d/0x3a [sky2] [<ffffffffa01ed2b0>] sky2_poll+0x96f/0xabd [sky2] [<ffffffff812f8bad>] net_rx_action+0xb6/0x1ee [<ffffffff812f8c9c>] ? net_rx_action+0x1a5/0x1ee [<ffffffff81051353>] __do_softirq+0x94/0x179 [<ffffffff810127ac>] call_softirq+0x1c/0x30 [<ffffffff8101393e>] do_softirq+0x52/0xb9 [<ffffffff81050f76>] irq_exit+0x53/0x90 [<ffffffff81013c57>] do_IRQ+0x12c/0x151 [<ffffffff81011e93>] ret_from_intr+0x0/0x2e <EOI> [<ffffffff8102b4c8>] ? native_safe_halt+0xb/0xd [<ffffffff8106fbf9>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81017dd8>] ? default_idle+0x51/0x7c [<ffffffff81399c0b>] ? atomic_notifier_call_chain+0xf/0x11 [<ffffffff8101025d>] ? enter_idle+0x27/0x29 [<ffffffff810102c7>] ? cpu_idle+0x68/0xb3 [<ffffffff813819e7>] ? rest_init+0x6b/0x6d ---[ end trace 5f893f17106c5153 ]--- The message I remembered was actually on the patch list and if it was applied, would have been after rc3 since the complaint had it occuring in 2.6.29-0.66.rc3.fc11.x86_64. http://patchwork.kernel.org/patch/5960/ The DeviceKit-disks update didn't help with gnome triggering my disk devices getting removed. I still can't usefully use gnome. I have some dmesg output from after running smartctl -a -d sat /dev/sdb: mptscsih: ioc0: attempting task abort! (sc=ffff88005cc08780) sd 8:0:1:0: [sdb] CDB: ATA command pass through(16): 85 08 0e 00 00 00 01 00 00 00 00 00 00 00 ec 00 mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000) mptsas: ioc0: removing sata device, channel 0, id 1, phy 1 port-8:1: mptsas: ioc0: delete port (1) sd 8:0:1:0: [sdb] Synchronizing SCSI cache mptscsih: ioc0: task abort: SUCCESS (sc=ffff88005cc08780) mptscsih: ioc0: attempting task abort! (sc=ffff88005cc083c0) sd 8:0:1:0: [sdb] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00 mptscsih: ioc0: task abort: SUCCESS (sc=ffff88005cc083c0) mptscsih: ioc0: attempting target reset! (sc=ffff88005cc08780) sd 8:0:1:0: [sdb] CDB: ATA command pass through(16): 85 08 0e 00 00 00 01 00 00 00 00 00 00 00 ec 00 mptscsih: ioc0: target reset: SUCCESS (sc=ffff88005cc08780) mptscsih: ioc0: attempting bus reset! (sc=ffff88005cc08780) sd 8:0:1:0: [sdb] CDB: ATA command pass through(16): 85 08 0e 00 00 00 01 00 00 00 00 00 00 00 ec 00 mptscsih: ioc0: bus reset: SUCCESS (sc=ffff88005cc08780) mptscsih: ioc0: attempting host reset! (sc=ffff88005cc08780) mptbase: ioc0: Initiating recovery mptscsih: ioc0: host reset: SUCCESS (sc=ffff88005cc08780) sd 8:0:1:0: Device offlined - not ready after error recovery sd 8:0:1:0: Device offlined - not ready after error recovery end_request: I/O error, dev sdb, sector 168312775 md: super_written gets error=-5, uptodate=0 raid1: Disk failure on sdb3, disabling device. raid1: Operation continuing on 1 devices. raid1: sdb3: rescheduling sector 58984048 program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO sd 8:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO scsi 8:0:1:0: rejecting I/O to dead device RAID1 conf printout: --- wd:1 rd:2 disk 0, wo:0, o:1, dev:sda3 disk 1, wo:1, o:0, dev:sdb3 RAID1 conf printout: --- wd:1 rd:2 disk 0, wo:0, o:1, dev:sda3 raid1: sda3: redirecting sector 58984048 to another mirror scsi 8:0:1:0: rejecting I/O to dead device scsi 8:0:1:0: rejecting I/O to dead device end_request: I/O error, dev sdb, sector 299001472 md: super_written gets error=-5, uptodate=0 raid1: Disk failure on sdb6, disabling device. raid1: Operation continuing on 1 devices. RAID1 conf printout: --- wd:1 rd:2 disk 0, wo:0, o:1, dev:sda6 disk 1, wo:1, o:0, dev:sdb6 RAID1 conf printout: --- wd:1 rd:2 disk 0, wo:0, o:1, dev:sda6 Created attachment 336628 [details]
Dmesg log
Created attachment 336629 [details]
My smolt profile
Linux dhcp-lab-124.englab.brq.redhat.com 2.6.29-0.258.rc8.git2.fc11.x86_64 #1 SMP Mon Mar 16 20:53:26 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
*** This bug has been marked as a duplicate of bug 485778 *** |