Bug 110889
Summary: | SMP race fixes from rmap 15k are missing | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Martin Wilck <martin.wilck> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 9 | CC: | pfrields, riel |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
URL: | http://linuxvm.bkbits.net:8080/linux-2.4-rmap | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-01-05 03:44:16 UTC | Type: | --- |
Description
Martin Wilck
2003-11-25 10:04:12 UTC
Here is a sample panic:

CPU: 0
EIP: 0060:[<c01496b2>] Tainted: P
EFLAGS: 00010202
EIP is at rmqueue [kernel] 0x312 (2.4.20-20.9smp)
eax: 01040088 ebx: 0000efd0 ecx: 00001000 edx: 000054c9
esi: c1000030 edi: c0343400 ebp: c1128c28 esp: c6233e80
ds: 0068 es: 0068 ss: 0068
Process Bonnie (pid: 2676, stackpage=c6233000)
Stack: 00001000 c6232000 00000000 000044c9 000044c8 00000203 00000000 c0343400
       c0343400 c0345924 00000001 00000001 c01497b7 c034592c 00000000 000001d2
       00000000 c01498f1 c0345920 00000000 00000001 00000001

The bug is triggered in the DEBUG_LRU_PAGE() macro in rmqueue, which finds the PG_inactive_dirty flag set in the page flags (%eax).

Here is another one, this time in lru_cache_del()/del_page_from_inactive_clean_list() (invalid next pointer in list):

==> next->prev=prev
0xc0145656 <__lru_cache_del+742>: mov %edx,0x4(%eax)

*pde = 00000000
Oops: 0002
parport_pc lp parport autofs nfs lockd sunrpc e1000 keybdev mousedev hid input usb-ohci usbcore ext3 jbd aic79xx sd_mod scsi_mod
CPU: 1
EIP: 0060:[<c0145656>] Not tainted
EFLAGS: 00210206
EIP is at __lru_cache_del [kernel] 0x2e6 (2.4.20-20.9smp)
eax: 00000000 ebx: c0344680 ecx: c1cc176c edx: 00000000
esi: c1cc1750 edi: 000001fe ebp: 00000000 esp: f6475e00
ds: 0068 es: 0068 ss: 0068
Process tdnum (pid: 5846, stackpage=f6475000)
Stack: c1cc1750 00000000 c0145724 c1cc1750 c014904f 00200296 f6474000 c1cc1750
       000001d6 c013c4b8 140ac000 00000000 f6474000 00000000 00000000 c1cc1750
       00000000 000001fe c0344680 c014704f c1cc1750 000001f4 c0345840 c01477cc
Call Trace:
[<c0145724>] lru_cache_del [kernel] 0x44 (0xf6475e08))
[<c014904f>] __free_pages_ok [kernel] 0x3f (0xf6475e10))
[<c013c4b8>] wait_on_page_timeout [kernel] 0xc8 (0xf6475e24))
[<c014704f>] rebalance_laundry_zone [kernel] 0x11f (0xf6475e4c))
[<c01477cc>] rebalance_dirty_zone [kernel] 0x9c (0xf6475e5c))
[<c01478d5>] rebalance_inactive_zone [kernel] 0x85 (0xf6475e7c))
[<c0147988>] rebalance_inactive [kernel] 0x48 (0xf6475e9c))
[<c01479ef>] do_try_to_free_pages [kernel] 0x1f (0xf6475ec0))
[<c01480f1>] try_to_free_pages [kernel] 0x51 (0xf6475ed4))
[<c0149957>] __alloc_pages [kernel] 0x167 (0xf6475ee4))
[<c0156d2c>] generic_commit_write [kernel] 0x8c (0xf6475f00))
[<c013f1b4>] generic_file_write [kernel] 0x394 (0xf6475f24))
[<c0152e07>] sys_write [kernel] 0x97 (0xf6475f94))
[<c01098cf>] system_call [kernel] 0x33 (0xf6475fc0))

==> prev->next=next
0xc0145659 <__lru_cache_del+745>: mov %eax,(%edx)
==> entry->next=entry->prev=NULL ;

Just looked at 2.4.20-24.9: it does NOT include the fixes I mention above, as I had hoped. I am disappointed. This is a real bug that crashes real systems!!!

The customer would like to know more about the expected time frame for fixing the bug.
Thanks,
Giuseppe

2.4.20-24.9 was released to fix the recent do_brk security bug, and no non-security fixes went into that tree. A separate 'bug fix' update is going to be released very soon. I'll look into these patches for that update.
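
For readers unfamiliar with the list code, the three steps annotated in the second oops (next->prev=prev, prev->next=next, entry->next=entry->prev=NULL) correspond to an unlink of roughly the following shape. This is a minimal user-space sketch in C, not the kernel's actual code: `struct list_node`, `list_init`, `list_add`, and `list_del_checked` are hypothetical names chosen for illustration, and the NULL check is an addition made here so the example reports the problem instead of faulting. In the register dump both eax and edx are 00000000, so the store `mov %edx,0x4(%eax)` would write to address 0x4; that is consistent with, though not proof of, the entry having already been unlinked once by another CPU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Doubly linked list node, loosely modeled on the kernel's struct list_head.
 * All names in this sketch are hypothetical. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_node *entry, struct list_node *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and clear its pointers, mirroring the annotated steps.
 * Returns false if the entry's pointers are already NULL, i.e. it looks
 * as if it was unlinked before. Without this check (an assumption added
 * for the sketch), "next->prev = prev" with next == NULL would store to
 * offset 4 from address 0 on a 32-bit machine. */
static bool list_del_checked(struct list_node *entry)
{
    struct list_node *next = entry->next;
    struct list_node *prev = entry->prev;

    if (next == NULL || prev == NULL)
        return false;

    next->prev = prev;      /* ==> next->prev=prev */
    prev->next = next;      /* ==> prev->next=next */
    entry->next = NULL;     /* ==> entry->next=entry->prev=NULL */
    entry->prev = NULL;
    return true;
}
```

Calling `list_del_checked` twice on the same node returns true the first time and false the second. In a real SMP race the two removals would come from different CPUs that do not hold the same lock, which is the kind of problem the requested rmap fixes are meant to close.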