Description of problem:
Restarting an f11-x86_64 guest with "system_reset" in the monitor during guest installation makes the guest hit a kernel panic.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 5.4-2.0.99 (9.1) (kvm-83-83.el5)

How reproducible:
100%

Steps to Reproduce:
1. install the F11-x86_64 guest from cdrom:
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -drive file=fedora11-64.qcow2,if=ide -cpu qemu64,+sse2 -vnc :16 -net nic,vlan=0,macaddr=20:20:20:00:49:58 -net tap,vlan=0,script=/mnt/images/qemu-ifup -m 1024 -cdrom Fedora-11-x86_64-DVD.iso -boot d
2. type ' system_reset ' into qemu monitor
3. vm restart. redo the installation

Actual results:
Guest: Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(202,17)
A screenshot will be attached.

Expected results:

Additional info:
This problem only happened with f11-x86_64 on the following hosts:

host1 CPU:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz

host2 CPU:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz

Tried on one AMD host; the problem does not occur there. f11-i386 does not have this problem on either Intel or AMD.
Created attachment 350911 [details] screenshot of kernel panic
This problem is found only in fedora-x86-64. Tried with 32-bit rhel5.3s and 64-bit rhel5.3s; they do not have this problem.
How about 5.4?
(In reply to comment #3)
> How about 5.4?

Both 32-bit rhel5.4s and 64-bit rhel5.4s do not have this problem.
Can you retest with latest f11?
Tested with the latest f11 on Red Hat Enterprise Virtualization Hypervisor release 5.4-2.1 (1) (kvm-83-105.el5_4.9); this problem still exists.
CMD:
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -drive file=fedora11-64.qcow2,if=ide -cpu qemu64,+sse2 -vnc :16 -net nic,vlan=0,macaddr=20:20:20:00:49:58 -net tap,vlan=0,script=qemu-ifup -m 1024 -cdrom Fedora-11-x86_64-DVD.iso -smp 2 -boot d&
(In reply to comment #0)
> Steps to Reproduce:
> 1. install the F11-x86_64 guest from cdrom:
> /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -drive
> file=fedora11-64.qcow2,if=ide -cpu qemu64,+sse2 -vnc :16 -net
> nic,vlan=0,macaddr=20:20:20:00:49:58 -net
> tap,vlan=0,script=/mnt/images/qemu-ifup -m 1024 -cdrom Fedora-11-x86_64-DVD.iso

It's probably not the cause but please use standard cli:
http://cleo.tlv.redhat.com/qumrawiki/KVM/

> -boot d
> 2. type ' system_reset ' into qemu monitor

When do you call it? Is it when install finished and you need to restart?

> 3. vm restart. redo the installation

Why redo?
(In reply to comment #7)
> (In reply to comment #0)
>
> > Steps to Reproduce:
> > 1. install the F11-x86_64 guest from cdrom:
> > /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -drive
> > file=fedora11-64.qcow2,if=ide -cpu qemu64,+sse2 -vnc :16 -net
> > nic,vlan=0,macaddr=20:20:20:00:49:58 -net
> > tap,vlan=0,script=/mnt/images/qemu-ifup -m 1024 -cdrom Fedora-11-x86_64-DVD.iso
>
> It's probably not the cause but please use standard cli:
> http://cleo.tlv.redhat.com/qumrawiki/KVM/
>
> > -boot d
> > 2. type ' system_reset ' into qemu monitor
>
> When do you call it? Is it when install finished and you need to restart?

'system_reset' is called during the installation, before it has finished.

> > 3. vm restart. redo the installation
>
> Why redo?

Found that when the guest is started with 4G of memory (-m 4G), this problem does not exist. When the guest is started with less than 4G of memory (-m 2G), this problem exists.
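For anyone repeating the memory comparison, here is a minimal sketch that builds the command line from comment #0 with only the -m value changed. The helper name `qemu_kvm_args` is made up for illustration; every flag is copied from the reporter's command, and the two sizes are the ones mentioned above.

```python
def qemu_kvm_args(memory):
    """Return the reporter's qemu-kvm argument list with a chosen -m value.

    Hypothetical helper: the flags are copied from comment #0, only the
    memory size is a parameter.
    """
    return [
        "/usr/libexec/qemu-kvm",
        "-no-hpet",
        "-usbdevice", "tablet",
        "-rtc-td-hack",
        "-drive", "file=fedora11-64.qcow2,if=ide",
        "-cpu", "qemu64,+sse2",
        "-vnc", ":16",
        "-net", "nic,vlan=0,macaddr=20:20:20:00:49:58",
        "-net", "tap,vlan=0,script=/mnt/images/qemu-ifup",
        "-m", memory,
        "-cdrom", "Fedora-11-x86_64-DVD.iso",
        "-boot", "d",
    ]


# Print one command line per memory size tried in this comment.
for memory in ("2G", "4G"):
    print(" ".join(qemu_kvm_args(memory)))
```

Keeping everything except -m fixed makes it easier to attribute a change in behaviour to the memory size alone.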
I am unable to reproduce here. My processors are much older than the ones in the test:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
...
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida tpr_shadow vnmi flexpriority
bogomips : 4388.59
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

This is the most similar processor that I have around (only dual core, and model 15 vs model 23 of the reproducer).

My command line is:

[root@deus ~]# /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -drive file=/mnt/images/10-i386.img,if=ide -cpu qemu64,+sse2 -vnc :0 -net nic,vlan=0,macaddr=20:20:20:00:49:58 -net tap,vlan=0,script=/etc/kvm-ifup -m 1024 -cdrom /mnt/images/iso/Fedora-11-x86_64-DVD.iso -monitor stdio -boot d

(The only changes are that I put the monitor on stdio and changed the paths of cdrom/file/....)

Chen, can you check whether the problem still exists, or give me access to a machine where it is reproducible?

Later, Juan.
Found a Core i7 machine that I can use. I can reproduce the bug there (RHEL5.4). Disabling ept (modprobe kvm-intel enable-ept=0) makes no difference. I continue investigating.
Created attachment 378363 [details] Error with ept disabled
Created attachment 378485 [details] 1st run with 1G memory
Created attachment 378486 [details] dmesg 2nd run with 1G memory
Created attachment 378487 [details] dmesg 1st run with 4G
Created attachment 378488 [details] dmesg 2nd run with 4G
I uploaded the dmesg output of the before (1st) and after (2nd) runs for 1G and 4G of RAM.

The interesting difference with 1G (which does not appear with 4G) is in these two lines:

checking if image is initramfs... it is
bad:
checking if image is initramfs...it isn't (junk in compressed archive); looks like an initrd

With 4G, both runs print the first line.

There is also a difference (with both 1G and 4G) in this line:

pci 0000:00:01.0: PIIX3: Enabling Passive Release

It only appears the 2nd time, but it makes no difference with 4G.

We tried this patch:

commit 15a1956af94e36105494f782a752698103addf63
Author: Gleb Natapov <gleb>
Date: Wed Jun 17 19:32:01 2009 +0300

Call piix3_reset() on system reset.
Also zero pci_irq_levels on reset to avoid stuck irq after reset.

Signed-off-by: Gleb Natapov <gleb>
Signed-off-by: Yaniv Kamay <ykamay>

But it doesn't fix the problem either.
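The comparison of the two runs can be repeated with a small script. This is only a sketch of how the logs could be diffed, not the method actually used here; the function names are made up, and it assumes any timestamp prefix has the usual `[   12.345678]` form.

```python
import re

# Assumed dmesg timestamp prefix, e.g. "[    0.512345] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(lines):
    """Strip trailing newlines and timestamp prefixes so runs compare equal."""
    return [_TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def only_in_each(first_run, second_run):
    """Return (lines only in the 1st run, lines only in the 2nd run)."""
    first = normalize(first_run)
    second = normalize(second_run)
    first_set, second_set = set(first), set(second)
    only_first = [line for line in first if line not in second_set]
    only_second = [line for line in second if line not in first_set]
    return only_first, only_second


if __name__ == "__main__":
    import sys

    # Usage: python dmesg_diff.py dmesg-1st.txt dmesg-2nd.txt
    with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
        before, after = only_in_each(a.readlines(), b.readlines())
    for line in before:
        print("1st only:", line)
    for line in after:
        print("2nd only:", line)
```

Stripping the timestamps first matters because otherwise nearly every line differs between two boots and the two interesting lines are lost in the noise.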
Luckily for us, the rhel5.5 version of qemu doesn't have this bug, so the bug was easily bisectable. The commit that fixes the problem is d0c1a4bbc1f05f9057122b3efcb66e349fdfb70a (BZ #531701). We should Z-stream it.
Yeah. After taking a look at this, this patch does indeed have the side effect of restoring the kvmclock MSR to its default value of disabled. This is desired, to keep the kernel from writing to an arbitrary piece of memory.
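To illustrate why a stale kvmclock MSR is harmful across a reset, here is a toy model. It is not qemu or kvm code and not the patch itself: the class, the addresses, the "TIME" payload and the image bytes are all invented for illustration, and the only thing taken from the real interface is the assumption that the low bit of the MSR value acts as the enable flag.

```python
KVMCLOCK_ENABLE = 1  # assumption of this toy: bit 0 marks the clock page active


class ToyGuest:
    """Toy model only: guest RAM plus one kvmclock-style MSR value."""

    def __init__(self, ram_size, reset_clears_msr):
        self.ram = bytearray(ram_size)
        self.kvmclock_msr = 0  # 0 means "disabled"
        self.reset_clears_msr = reset_clears_msr

    def guest_enables_kvmclock(self, addr):
        # The guest registers a page address and sets the enable bit.
        self.kvmclock_msr = addr | KVMCLOCK_ENABLE

    def host_updates_clock(self):
        # While enabled, time data is written to the registered address.
        if self.kvmclock_msr & KVMCLOCK_ENABLE:
            addr = self.kvmclock_msr & ~KVMCLOCK_ENABLE
            self.ram[addr:addr + 4] = b"TIME"

    def system_reset(self):
        # The MSR is only cleared if the reset path restores its default.
        if self.reset_clears_msr:
            self.kvmclock_msr = 0


def image_survives_second_boot(reset_clears_msr):
    guest = ToyGuest(ram_size=64, reset_clears_msr=reset_clears_msr)
    guest.guest_enables_kvmclock(addr=16)
    guest.host_updates_clock()
    guest.system_reset()
    # Second boot: an image is loaded over the old clock page.
    image = b"INITRD-IMAGE-DATA"
    guest.ram[8:8 + len(image)] = image
    guest.host_updates_clock()  # a stale MSR scribbles over the image here
    return bytes(guest.ram[8:8 + len(image)]) == image


print("MSR cleared on reset, image intact:", image_survives_second_boot(True))
print("MSR left stale, image intact:", image_survives_second_boot(False))
```

In the toy, leaving the MSR enabled after reset lets the clock update land on memory that the second boot has reused for something else, which is the kind of arbitrary write described above.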
So, is this a dup of #531701?
I wouldn't call it dup. The patch just accidentally fixed another issue.
It is not a real dup of 531701. But as its solution fixes it, I am closing it as a duplicate of that bug, just to point to the patch that fixed the problem.

*** This bug has been marked as a duplicate of bug 531701 ***