Bug 127515 (mark_clean)
Summary: undef ENABLE_MARK_CLEAN in arch/ia64/hp/common/sba_iommu.c?
Product: Red Hat Enterprise Linux 2.1
Component: kernel
Version: 2.1
Hardware: ia64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Don Howard <dhoward>
Assignee: Don Howard <dhoward>
QA Contact: Brian Brock <bbrock>
CC: cww, jparadis, ltroan, mike.miller, riel, tao
Doc Type: Bug Fix
Last Closed: 2005-04-28 15:10:06 UTC
Bug Blocks: 132992

Description
Don Howard
2004-07-09 07:38:18 UTC
<0>Kernel panic: not continuing In interrupt handler - not syncing <6>Syncing device 68:04 ... kernel BUG at sched.c:834! Unable to handle kernel NULL pointer dereferencecp[2095]: Oops 11003706212352 --> schedule [kernel] 0x81 <-- Pid: 2095, comm: cp psr : 0000121008026018 ifs : 8000000000000813 ip : [<e000000004470781>] Not tainted unat: 0000000000000000 pfs : 0000000000000813 rsc : 0000000000000003 rnat: 0000000000001000 bsps: e0000040dadf8000 pr : 80000000ff615565 ldrs: 0000000000000000 ccv : 000000007fffffff fpsr: 0009804c8a70033f b0 : e000000004470780 b6 : e0000000045e8d40 b7 : e00000000440e2b0 f6 : 0fffbccccccccc8c00000 f7 : 0ffdca200000000000000 f8 : 100028000000000000000 f9 : 10002a000000000000000 r1 : e000000004bb2310 r2 : 00000000000051d7 r3 : e00000000485d2d5 r8 : 000000000000001b r9 : 0000000000000000 r10 : 0000000000000000 r11 : 80000000ff611a65 r12 : e0000040d674f970 r13 : e0000040d6748000 r14 : 0000000000000000 r15 : e00000000495da20 r16 : e00000000495da08 r17 : 0000000000000000 r18 : 0000000000000001 r19 : e000000004a13750 r20 : e000000004a13748 r21 : e0000000049bdc58 r22 : 000000000000ffff r23 : 0000000000000000 r24 : 0000000000000058 r25 : 0000000000000059 r26 : 000000000000005a r27 : 00000000000000e0 r28 : 0000000000000000 r29 : 0000000000000001 r30 : 0000000000000005 r31 : 0000000000000894 Call Trace: [<e000000004412d90>] sp=0xe0000040d674f560 bsp=0xe0000040d67498d0 decoded to show_stack [kernel] 0x50 [<e0000000044135c0>] sp=0xe0000040d674f720 bsp=0xe0000040d6749878 decoded to show_regs [kernel] 0x7c0 [<e00000000442c7e0>] sp=0xe0000040d674f740 bsp=0xe0000040d6749850 decoded to die [kernel] 0x120 [<e00000000444bc20>] sp=0xe0000040d674f740 bsp=0xe0000040d67497e8 decoded to ia64_do_page_fault [kernel] 0x780 [<e00000000440dce0>] sp=0xe0000040d674f7d0 bsp=0xe0000040d67497e8 decoded to ia64_leave_kernel [kernel] 0x0 [<e000000004470780>] sp=0xe0000040d674f970 bsp=0xe0000040d6749750 decoded to schedule [kernel] 0x80 [<e0000000044d9750>] 
sp=0xe0000040d674f980 bsp=0xe0000040d6749710 decoded to __wait_on_buffer [kernel] 0xf0 [<e0000000044dc800>] sp=0xe0000040d674f9b0 bsp=0xe0000040d67496e8 decoded to bread [kernel] 0xe0 [<e000000004540540>] sp=0xe0000040d674f9c0 bsp=0xe0000040d6749688 decoded to ext2_update_inode [kernel] 0x2e0 [<e000000004540cb0>] sp=0xe0000040d674f9d0 bsp=0xe0000040d6749668 decoded to ext2_write_inode [kernel] 0x30 [<e000000004507d60>] sp=0xe0000040d674f9d0 bsp=0xe0000040d6749600 decoded to sync_inodes_sb [kernel] 0x2c0 [<e000000004508740>] sp=0xe0000040d674f9d0 bsp=0xe0000040d67495e0 decoded to sync_inodes [kernel] 0x60 [<e0000000044da260>] sp=0xe0000040d674f9d0 bsp=0xe0000040d67495c8 decoded to fsync_dev [kernel] 0x40 [<e0000000045ef270>] sp=0xe0000040d674f9d0 bsp=0xe0000040d6749598 decoded to go_sync [kernel] 0x370 [<e0000000045ef460>] sp=0xe0000040d674f9e0 bsp=0xe0000040d6749568 decoded to do_emergency_sync [kernel] 0x180 [<e000000004478320>] sp=0xe0000040d674f9e0 bsp=0xe0000040d6749510 decoded to panic [kernel] 0x300 [<e00000000442c870>] sp=0xe0000040d674fa20 bsp=0xe0000040d67494e8 decoded to die [kernel] 0x1b0 [<e00000404764b580>] sp=0xe0000040d674fa20 bsp=0xe0000040d67494c0 decoded to vlan_ioctl_hook_R570c4b11 [] 0x42c8dae0 [<e00000404764b580>] sp=0xe0000040d674fa20 bsp=0xe0000040d6749498 decoded to vlan_ioctl_hook_R570c4b11 [] 0x42c8dae0 [<e00000404764b580>] sp=0xe0000040d674fa20 bsp=0xe0000040d6749470 decoded to vlan_ioctl_hook_R570c4b11 [] 0x42c8dae0 ... 
Unable to handle kernel paging request at virtual address 3030203030303030 kswapd[5]: Oops 8813272891392 --> kmem_cache_reap [kernel] 0x570 <-- Pid: 5, comm: kswapd psr : 0000101008022038 ifs : 8000000000000c1a ip : [<e0000000044d0b50>] Not tainted unat: 0000000000000000 pfs : 0000000000000c1a rsc : 0000000000000003 rnat: 80000000ff602939 bsps: e0000000044141c0 pr : 80000000ff602979 ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a74433f b0 : e0000000044d0970 b6 : e0000000044141c0 b7 : e00000000440d990 f6 : 0fff6fffffffff0000000 f7 : 0ffe7b800000000000000 f8 : 1000bb800000000000000 f9 : 100078000000000000000 r1 : e000000004cf5760 r2 : 0000000000000000 r3 : e000004046a37d98 r8 : 0000000000000017 r9 : ffffffffffffffff r10 : 0000000000000000 r11 : 0000000000000a98 r12 : e000004046a37e20 r13 : e000004046a30000 r14 : 3030203030303030 r15 : e00000404722b210 r16 : 0000000000000032 r17 : 000000000000242b r18 : e0000040d3608008 r19 : 0000000000000000 r20 : 0000000000000000 r21 : 0000000066666667 r22 : 0000000000000000 r23 : 0000000000000000 r24 : ffffffffffff781a r25 : e0000040fef68058 r26 : 0000000000000000 r27 : e000004046a37e30 r28 : e000004046a37e38 r29 : 0000000000000001 r30 : 0000000000000000 r31 : 0000000000000000 Call Trace: [<e000000004414910>] sp=0xe000004046a37a10 bsp=0xe000004046a313a0 decoded to show_stack [kernel] 0x50 [<e000000004415140>] sp=0xe000004046a37bd0 bsp=0xe000004046a31348 decoded to show_regs [kernel] 0x7c0 [<e00000000442fad0>] sp=0xe000004046a37bf0 bsp=0xe000004046a31320 decoded to die [kernel] 0x190 [<e000000004452580>] sp=0xe000004046a37bf0 bsp=0xe000004046a312c0 decoded to ia64_do_page_fault [kernel] 0x780 [<e00000000440df20>] sp=0xe000004046a37c80 bsp=0xe000004046a312c0 decoded to ia64_leave_kernel [kernel] 0x0 [<e0000000044d0b50>] sp=0xe000004046a37e20 bsp=0xe000004046a311e8 decoded to kmem_cache_reap [kernel] 0x570 [<e0000000044d8820>] sp=0xe000004046a37e30 bsp=0xe000004046a311c8 decoded to do_try_to_free_pages [kernel] 0xa0 
[<e0000000044d9150>] sp=0xe000004046a37e30 bsp=0xe000004046a311a0 decoded to kswapd [kernel] 0x330 [<e000000004415f30>] sp=0xe000004046a37e50 bsp=0xe000004046a31168 decoded to arch_kernel_thread [kernel] 0x70 [<e000000004484010>] sp=0xe000004046a37e50 bsp=0xe000004046a31138 decoded to kernel_thread [kernel] 0xd0 [<e0000000048caf30>] sp=0xe000004046a37e50 bsp=0xe000004046a31128 decoded to kswapd_init [kernel] 0x50 ...

Opening bug to HP per Summer to request help from HP Engineering. Reference Issue Trackers 42071 (HP L3 escalation), 44090 (HP-IPF)

ia64 i-caches are not coherent with respect to processor stores. So in general, when mapping an executable page, we have to flush the i-cache to avoid executing stale instructions. This flush normally happens in update_mmu_cache().

However, the i-cache IS coherent with respect to DMA. So if we DMA over an entire page and subsequently map it as executable, we can skip the flush. mark_clean() performs this optimization by setting the PG_arch_1 bit. update_mmu_cache() skips the i-cache flush if PG_arch_1 is set.

I expect that the effectiveness of this optimization depends on the percentage of DMA-read pages that are subsequently mapped executable. If very few of them are ever executed (as is probably the case for Oracle), the time spent doing mark_clean() is wasted. On the other hand, if we're often reading executable pages from the disk, we can do a lot of mark_clean()s for the cost of a cache flush, so it's probably a win overall.

The system should operate correctly either with or without mark_clean(). For general-purpose use, I think we want to keep it, but it might be worthwhile to consider a tunable for systems where almost all DMA reads are for non-executable data.

While looking at the code, I noticed that RHEL3 U3 calls mark_clean() while holding the ioc->res_lock(), which is not needed (this is fixed in 2.6 already). Before adding a tunable, I'd propose moving the mark_clean() outside the critical section to make sure it's not just a lock contention problem they're seeing.

Hope this helps.

Do we have any preliminary patches for us to build test kernels with, for the customer to test, apart from Don's untested patch?

Created attachment 108021 [details]
Disable mark_clean in sba_iommu.c to avoid memory corruption.
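
For readers following the discussion, here is a minimal userspace sketch of the PG_arch_1 optimization described above. It is not the kernel code and not the attached patch. The names mark_clean_sim and map_executable_sim, the toy struct page, the icache_flushes counter, and the PAGE_SIZE value are all hypothetical stand-ins chosen for illustration. Only the idea comes from the thread: pages fully overwritten by DMA get a flag, and the mapping path skips the i-cache flush when that flag is set. Removing the ENABLE_MARK_CLEAN define models the workaround of disabling the optimization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page size for the model; ia64 kernels support several. */
#define PAGE_SIZE 16384UL

/* Remove this define to model the workaround of disabling mark_clean(). */
#define ENABLE_MARK_CLEAN

/* Toy stand-in for the kernel's struct page; arch_1 models PG_arch_1. */
struct page {
    bool arch_1;
};

/* Counts simulated i-cache flushes so the effect is observable. */
static int icache_flushes;

/*
 * Models mark_clean(): after a DMA write of [addr, addr + size), flag only
 * the pages that were overwritten in full. Partially covered pages are
 * left alone because they may still hold stale instructions.
 */
static void mark_clean_sim(struct page *pages, size_t npages,
                           unsigned long addr, size_t size)
{
#ifdef ENABLE_MARK_CLEAN
    /* Round up to the first page boundary at or after addr. */
    unsigned long pg = (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    unsigned long end = addr + size;

    while (pg + PAGE_SIZE <= end) {
        assert(pg / PAGE_SIZE < npages);
        pages[pg / PAGE_SIZE].arch_1 = true;
        pg += PAGE_SIZE;
    }
#else
    (void)pages; (void)npages; (void)addr; (void)size;
#endif
}

/*
 * Models the update_mmu_cache() path for an executable mapping: skip the
 * flush if the page is already flagged, otherwise flush once and flag it.
 */
static void map_executable_sim(struct page *p)
{
    if (p->arch_1)
        return;
    icache_flushes++;   /* stands in for the real i-cache flush */
    p->arch_1 = true;
}
```

In this model, a DMA that starts on a page boundary and covers one and a half pages flags only the first page. Mapping that page executable costs no flush, while mapping the half-covered page costs exactly one. With ENABLE_MARK_CLEAN removed, every first executable mapping pays for a flush, which matches the statement above that the system should operate correctly either way.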

I am currently working to get this included in the U7 update.

This seems interesting: derry isn't specifying GFP_DMA when allocating pages for just that: dma. I would expect IO failure, rather than the hangs and oopses that are described, so I'm unsure if it would relate here.

[snipped from derry vs taroon diff of sba_iommu.c]
@@ -947,7 +972,7 @@ sba_alloc_consistent(struct pci_dev *hwd
 		return 0;
 	}
 
-	ret = (void *) __get_free_pages(GFP_ATOMIC, get_order(size));
+	ret = (void *) __get_free_pages(GFP_ATOMIC|GFP_DMA, get_order(size));
 
 	if (ret) {
 		memset(ret, 0, size);
@

Larry Woodman pointed out to me that omission of GFP_DMA could potentially cause trouble on ia64 machines that lack an iommu, but most (all?) of the reports I've seen of this are on hp hardware that has an iommu.

Is this IN or OUT of U7?

The patch that disables mark_clean() is in U7.

A fix for this problem has just been committed to the RHEL2.1 U7 patch pool this evening (in kernel version 2.4.18-54.1).

Make that kernel version 2.4.18-55...

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-284.html