Created attachment 516318 [details]
host crash analyze info

Description of problem:
Install an overcommitted guest (or several, which makes the problem easier to reproduce) and quit it as the guest is finishing installation. The host may crash.

Version-Release number of selected component (if applicable):
2.6.32-174.el6.x86_64

How reproducible:
2 / 25    2.6.32-174.el6.x86_64
2 / 20+   2.6.32-171.el6.x86_64
0 / 20    2.6.32-170.el6.x86_64

Steps to Reproduce:
1. Install an overcommitted guest (say host = 32G, guest = 33G).
2. Quit the guest by typing "q" in the monitor.
3.

Actual results:
The host crashes.

Expected results:
The guest quits and the host keeps working.

Additional info:
host info:
qemu-kvm-0.12.1.2-2.175.el6.x86_64
32G
12 cpus:
processor : 11
vendor_id : AuthenticAMD
cpu family : 16
model : 8
model name : Six-Core AMD Opteron(tm) Processor 2427
cmd to start guest:
# qemu-kvm -name rhel61_32_ins -monitor stdio \
    -chardev socket,id=serial_id_20110802-084213-3dmc,path=/tmp/serial-20110802-084213-3dmc,server,nowait \
    -device isa-serial,chardev=serial_id_20110802-084213-3dmc \
    -drive file=/images/RHEL-Server-6.1-32.qcow2,index=0,if=none,id=drive-ide0-0-0,media=disk,cache=none,format=qcow2,aio=native \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -device e1000,netdev=idQVRARF,mac=9a:a4:46:90:43:4e,id=ndev00idQVRARF,bus=pci.0,addr=0x3 \
    -netdev tap,id=idQVRARF,vhost=on,ifname=t0-083dmc,script=/qemu-ifup-switch,downscript=no \
    -m 33792 -smp 12,cores=1,threads=1,sockets=12 \
    -drive file=/RHEL6.1-Server-i386.iso,index=1,if=none,id=drive-ide0-0-1,media=cdrom,readonly=on,format=raw \
    -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 \
    -drive file=/rhel61-32/ks.iso,index=2,if=none,id=drive-ide0-1-0,media=cdrom,readonly=on,format=raw \
    -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
    -cpu cpu64-rhel6,+sse2,+x2apic \
    -kernel /rhel61-32/vmlinuz -initrd /rhel61-32/initrd.img \
    -vnc :10 -rtc base=utc,clock=host,driftfix=none -M rhel6.1.0 \
    -boot order=cdn,once=n,menu=off -usbdevice tablet -no-kvm-pit-reinjection \
    --append 'ks=cdrom nicdelay=60 console=ttyS0,115200 console=tty0' \
    -enable-kvm
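For reference, the size of the overcommit in the command above can be checked with simple arithmetic. This is only a minimal sketch: it assumes the host's "32G" means 32 GiB and that the -m value is in MiB, and the function name is made up for illustration.

```python
MIB_PER_GIB = 1024


def overcommit_mib(guest_mib: int, host_gib: int) -> int:
    """Return how many MiB the guest's -m value exceeds host RAM (negative if it fits)."""
    return guest_mib - host_gib * MIB_PER_GIB


if __name__ == "__main__":
    # -m 33792 from the command line above, host with 32G of memory
    extra = overcommit_mib(33792, 32)
    print(f"guest memory exceeds host RAM by {extra} MiB")
```

Under those assumptions the guest is configured with 1 GiB more memory than the host physically has, which matches the "host = 32G, guest = 33G" description.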
KERNEL: /usr/lib/debug/lib/modules/2.6.32-174.el6.x86_64/vmlinux
DUMPFILE: analized.vmcore [PARTIAL DUMP]
CPUS: 12
DATE: Tue Aug 2 08:56:12 2011
UPTIME: 23:23:05
LOAD AVERAGE: 1.05, 1.07, 0.65
TASKS: 355
NODENAME: amd-2427-32-2.englab.nay.redhat.com
RELEASE: 2.6.32-174.el6.x86_64
VERSION: #1 SMP Thu Jul 28 00:31:11 EDT 2011
MACHINE: x86_64 (2199 Mhz)
MEMORY: 32 GB
PANIC: "kernel BUG at mm/mmap.c:2346!"
PID: 22872
COMMAND: "qemu"
TASK: ffff8804133d2080 [THREAD_INFO: ffff8804023fc000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
crash>
Hit the same problem when using kernel 2.6.32-184:

KERNEL: /usr/lib/debug/lib/modules/2.6.32-184.el6.x86_64/vmlinux
DUMPFILE: kdump_analyzing/kern-184 [PARTIAL DUMP]
CPUS: 2
DATE: Tue Aug 16 00:54:20 2011
UPTIME: 17:42:08
LOAD AVERAGE: 1.24, 0.90, 0.84
TASKS: 155
NODENAME: amd-5400b-4-3.englab.nay.redhat.com
RELEASE: 2.6.32-184.el6.x86_64
VERSION: #1 SMP Tue Aug 9 12:20:06 EDT 2011
MACHINE: x86_64 (2805 Mhz)
MEMORY: 3.9 GB
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
PID: 1405
COMMAND: "rpciod/1"
TASK: ffff880118fd6100 [THREAD_INFO: ffff880117b4e000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
Created attachment 518417 [details] kdump part0
Created attachment 518419 [details] kdump part1
Where is the crash bt output?
Created attachment 518590 [details] foreach bt > foreach_bt.txt
(In reply to comment #7)
> Where is the crash bt output?

Hi Cai Qian,

Attached is the foreach bt output.

FYI: the bt output and other details are in the folder. Download the attached "kdump part" files (part0, part1), then run:

cat ..part0 part1 > kdump.tb2
tar xjf kdump.tb2

Best Regards,
Xiaoqing.
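The same reassembly can be sketched in Python for anyone without a shell handy. This is only an illustration: the function name and the file paths passed to it are hypothetical, and it assumes the two attachments are the consecutive halves of one bzip2-compressed tarball, as the commands above suggest.

```python
import shutil
import tarfile
from pathlib import Path


def join_and_extract(parts, archive, dest):
    """Concatenate split parts in order, then unpack the resulting bzip2 tarball.

    Returns the sorted names of the top-level entries extracted into dest.
    """
    archive = Path(archive)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # Equivalent of `cat part0 part1 > kdump.tb2`
    with archive.open("wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)

    # Equivalent of `tar xjf kdump.tb2`; only do this for archives you trust
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)

    return sorted(p.name for p in dest.iterdir())
```

The order of the parts matters: passing them reversed produces a file that is not a valid archive.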
I am interested in the panic in the comment #3.

PANIC: "kernel BUG at mm/mmap.c:2346!"

The panic in the comment #4 looks like a known issue - bug 730756, as I read from the log-m.txt from the attachment.

<1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000400
<1>IP: [<ffffffffa03b17f1>] __br_deliver+0x61/0x100 [bridge]

Are you able to reproduce the first panic using the latest kernel?
(In reply to comment #10)

Hi Cai Qian,

> I am interested in the panic in the comment #3.
> PANIC: "kernel BUG at mm/mmap.c:2346!"
>
> The panic in the comment #4 looks like a known issue - bug 730756, as I read
> from the log-m.txt from the attachment.
>

Actually, I am not really sure that the two crashes above are the same issue. Their outputs are very different, although both were triggered while running the same job.

> <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000400
> <1>IP: [<ffffffffa03b17f1>] __br_deliver+0x61/0x100 [bridge]
>
> Are you able to reproduce the first panic using the latest kernel?

It is not always reproducible; I will keep trying to reproduce it :)

Thanks and Best Regards,
Xiaoqing.
> > I am interested in the panic in the comment #3.
> > PANIC: "kernel BUG at mm/mmap.c:2346!"

In addition, do you have a vmcore or logs available for this panic that I could look at?
Feel free to re-open if you can reproduce the kernel BUG at mm/mmap.c:2346! panic on the latest kernel.

*** This bug has been marked as a duplicate of bug 724037 ***
Re-opening and assigning to Andrew according to Dor's comment:

> ok, we will re-open it if mm/mmap.c:2346! crash is reproduced.
Please assign Andrew for it
This bug appears to be reporting crashes that have been reported in two other bugs. The one that we were to focus on (comment 3) looks very much like a dup of bug 724037. So why was this reopened? Does it reproduce with 2.6.32-182?
Suqin,

See comment 16, why has this bug been reopened? Is there still an issue with latest RHEL6 builds? I believe the issue (comment 3) that this bug was reporting has been resolved and this bug was correctly duped.

Drew
(In reply to comment #17)
> Suqin,
>
> See comment 16, why has this bug been reopened? Is there still an issue with
> latest RHEL6 builds? I believe the issue (comment 3) that this bug was
> reporting has been resolved and this bug was correctly duped.
>
> Drew

I re-opened it according to Dor's comment in the email: "Please assign Andrew for it"
(In reply to comment #18)
> (In reply to comment #17)
> > Suqin,
> >
> > See comment 16, why has this bug been reopened? Is there still an issue with
> > latest RHEL6 builds? I believe the issue (comment 3) that this bug was
> > reporting has been resolved and this bug was correctly duped.
> >
> > Drew
>
> I re-open it according to Dor's comment in the email: "Please assign Andrew for
> it"

You should only have done that if I hadn't already dealt with it :-) I'm redupping.

*** This bug has been marked as a duplicate of bug 724037 ***