+++ This bug was initially created as a clone of Bug #443651 +++

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14

Description of problem:
This has happened twice now. I haven't written a short program to force the situation, but I will work on that now that I know it's not a one-off problem.

I have to retype the kernel crash info by hand, so it may be incomplete. Please let me know if you need more specific information.

Kernel BUG at mm/rmap.c:590!
invalid opcode: 0000 [#1]
SMP
. . .
CPU: 1
EIP: 0060:[<c0463166>] not tainted VLI
EFLAGS: 00210286 (2.6.18-53el5PAE #1)
Process <our application> (pid 18897 ... )
Call Trace:
 [<c045d30b>] unmap_vmas+0x2f5/0x58f
 [<c046044f>] exit_mmap+0x68/0xdf
 [<c0428be7>] do_exit+0x1eb/0x734
 [<c0605f48>] do_page_fault+0x54f/0x5d3
 [<c04e4381>] copy_to_user+0x31/0x48
 [<c06059f9>] do_page_fault+0x0/0x5d3
 [<c0405a71>] error_code+0x39/0x40
Code: ...
EIP: [<c0463166>] page_remove_rmap+0x16/0x6d SS:ESP 0068:e5133ea8

I will work on a way to test this. If I find something easily reproducible, I will update this bug.

Version-Release number of selected component (if applicable):
kernel-2.6.18-53el5PAE

How reproducible:
Didn't try

Steps to Reproduce:
1.
2.
3.

Actual Results:

Expected Results:

Additional info:

--- Additional comment from lwoodman on 2008-04-24 12:05:49 EDT ---

Nothing obvious here, the page->_mapcount went negative!!! Could be a real logic bug or corruption in the page structure. Please let me know ASAP if this is reproducible.

Larry Woodman

--- Additional comment from lwoodman on 2008-04-24 12:08:45 EDT ---

Also, if you can get a crashdump file that would be great.

Larry

--- Additional comment from lwoodman on 2008-08-26 13:54:18 EDT ---

I added debug code in RHEL5-U3 to the BUG() statement encountered above so that more debugging data gets printed to the console. While this will not fix the problem, it will help us debug it if it is encountered again.

---------------------------------------------------------------------------------
void page_remove_rmap(struct page *page)
{
	if (atomic_add_negative(-1, &page->_mapcount)) {
		if (unlikely(page_mapcount(page) < 0)) {
			printk (KERN_EMERG "Eeek! page_mapcount(page) went negative! (%d)\n", page_mapcount(page));
			printk (KERN_EMERG "  page->flags = %lx\n", page->flags);
			printk (KERN_EMERG "  page->count = %x\n", page_count(page));
			printk (KERN_EMERG "  page->mapping = %p\n", page->mapping);
			BUG();
		}
---------------------------------------------------------------------------------
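
For readers following along, here is a minimal user-space sketch of why a second, unbalanced removal trips the check above. It assumes the convention used by kernels of this era, where _mapcount is stored as the number of mappings minus one (so -1 means "not mapped") and page_mapcount() returns the stored value plus one. The names model_page, model_add_negative, model_page_mapcount and model_page_remove_rmap are hypothetical stand-ins written with C11 atomics; they are not kernel APIs and this is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the field discussed here. */
struct model_page {
	atomic_int mapcount;	/* assumed to hold (mappings - 1); -1 means unmapped */
};

/*
 * User-space analogue of atomic_add_negative(): add delta atomically and
 * report whether the resulting value is negative.
 */
static bool model_add_negative(int delta, atomic_int *v)
{
	return atomic_fetch_add(v, delta) + delta < 0;
}

/* Analogue of page_mapcount(): number of mappings = stored value + 1. */
static int model_page_mapcount(struct model_page *page)
{
	return atomic_load(&page->mapcount) + 1;
}

/*
 * Models only the check quoted above. Returns true where the real code
 * would print the debug data and call BUG().
 */
static bool model_page_remove_rmap(struct model_page *page)
{
	if (model_add_negative(-1, &page->mapcount)) {
		/* Last mapping removed: mapcount is 0 if accounting was balanced. */
		if (model_page_mapcount(page) < 0)
			return true;	/* more removals than mappings */
	}
	return false;
}
```

Under these assumptions, a page with one mapping (stored value 0) survives the first removal with a mapcount of 0, and a second removal drives the mapcount to -1 and hits the check. That matches either possibility raised earlier in this bug: an unbalanced unmap in the logic, or a page structure whose counters were overwritten.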

At this point the page structure looks corrupt. The _count, _mapcount, index and flags don't make sense. The page is about to be unmapped and freed, yet PG_lru is not set and the counts are already too low.

Is this something that keeps happening and can be reproduced, or did it only happen once? If it is reproducible, a description of how to reproduce it would be very helpful.

Larry Woodman

Larry,

Would it help to have them run the instrumentation from BZ443651 in page_remove_rmap()?

Bill

Internal Status set to 'Waiting on Support'

This event sent from IssueTracker by bbraswel
 issue 219976

Yes.

Larry