Description of problem: This happened while running Apache Nutch (a Java program). In this particular case, the solrindex option was being executed. I cannot reproduce this problem at will; however, Nutch usually seems to trigger it within 24 hours. Java from java-1.7.0-openjdk is being used here. I installed the latest version of that package this morning (after this particular instance of the problem), but the problem has persisted through a few recent versions of the Java package. When this problem happens, it hangs one of the Linux processes running an instance of the JVM.

I don't know whether it's useful information, but the overall procedure involves starting up a JVM, executing for a while (usually no more than 10 or 15 minutes), shutting down the JVM, and repeating. This is possibly getting way off topic, but Nutch is capable of running in a Hadoop cluster. I have not explicitly configured any such thing, and I do not know how closely the default configuration resembles a Hadoop configuration, nor whether it is even relevant to this problem.

This instance of Linux is being run as a guest on a Fedora 17 host. The entire host has locked up completely a few times since I've been running Nutch as well, but that doesn't usually happen when this kernel bug happens, so I don't know whether it's related.
Additional info:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8118f4f4>] do_huge_pmd_wp_page+0x684/0xc20
PGD 11893d067 PUD 119242067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd i2c_piix4 soundcore virtio_balloon i2c_core virtio_net microcode virtio_blk
CPU 1
Pid: 23373, comm: java Not tainted 3.8.8-202.fc18.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffff8118f4f4>] [<ffffffff8118f4f4>] do_huge_pmd_wp_page+0x684/0xc20
RSP: 0018:ffff880117dd7cd8 EFLAGS: 00010246
RAX: ffff880110d07000 RBX: ffff8801167a8000 RCX: 0000000000a131fa
RDX: 0000000000a131f9 RSI: 00000000000000d0 RDI: ffff880110d07000
RBP: ffff880117dd7d68 R08: 0000000000016c20 R09: 00007f42a2b25000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880117834e60
R13: ffff880110d07000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f42a4dfd700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000118a32000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 23373, threadinfo ffff880117dd6000, task ffff8801194b8000)
Stack:
 ffff880117dd7d08 00007f42a2b25000 ffff880118a2b8a8 ffff880110d07000
 ffff880118a2b8a8 ffff880110d07000 00007f42a2b25000 ffffffff8118ff2d
 8000000031c000a5 00007f42a2a00000 00007f42a38aae88 ffffea00045828c0
Call Trace:
 [<ffffffff8118ff2d>] ? do_huge_pmd_anonymous_page+0x34d/0x450
 [<ffffffff8115f17e>] handle_mm_fault+0x17e/0x650
 [<ffffffff81093f93>] ? try_to_wake_up+0x203/0x2d0
 [<ffffffff81657181>] __do_page_fault+0x181/0x4f0
 [<ffffffff816574fe>] do_page_fault+0xe/0x10
 [<ffffffff81656c85>] do_async_page_fault+0x35/0x90
 [<ffffffff81653b48>] async_page_fault+0x28/0x30
Code: c0 48 89 45 98 4c 8b 8d 78 ff ff ff 0f 84 f7 04 00 00 48 8b 7d 98 45 31 d2 4c 89 75 90 4c 89 4d a0 45 89 d6 48 89 7d 88 49 89 fd <4d> 8b 07 48 8b 4d a0 31 f6 4c 89 e2 bf da 00 82 00 49 c1 e8 37
RIP [<ffffffff8118f4f4>] do_huge_pmd_wp_page+0x684/0xc20
 RSP <ffff880117dd7cd8>
CR2: 0000000000000000
Created attachment 739905 File: dmesg
We've seen a few reports similar to this trace, with no real explanation for it yet. Is there anything sensitive in the VM that prevents you from sharing it? If not, any chance you could prepare a slimmed-down virt image and put it up somewhere so I could try to reproduce this?
I can probably prepare a smaller image. I'll get back to you when I have something.
If there's still interest, I finally have a smaller image. It's about 1 GB compressed. The problem seemed to go away when I cut further than that. I don't have an internet-accessible place to put this long term, but could probably stash it in an S3 bucket for a short period.
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 18 kernel bugs. Fedora 18 has now been rebased to 3.11.4-101.fc18. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel. If you have moved on to Fedora 19, and are still experiencing this issue, please change the version to Fedora 19. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. It has been over a month since we asked you to test the 3.11 kernel updates and let us know if your issue has been resolved or is still a problem. When this happened, the bug was set to needinfo. Because the needinfo is still set, we assume either this is no longer a problem, or you cannot provide additional information to help us resolve the issue. As a result we are closing with insufficient data. If this is still a problem, we apologize; feel free to reopen the bug and provide more information so that we can work towards a resolution. If you experience different issues, please open a new bug report for those.
I offered a VM, but got no response. At this point, this is all probably irrelevant, so I'm happy to have this bug closed, but I recently started getting notifications about information being required, so I'm trying to eliminate those. If this response isn't sufficient, what do I need to do?