Description of problem:
When running guestmount in a nested RHEL VM on VMware vSphere, the action fails with:

qemu-kvm: /builddir/build/BUILD/qemu-2.6.0/target-i386/kvm.c:1736: kvm_put_msrs: Assertion `ret == n' failed.

This does not happen if the kvm_intel module is not loaded, but in that case other error messages are displayed:

$ LIBGUESTFS_BACKEND=direct guestmount -a /home/stack/images/overcloud-full.qcow2 -i /mnt/stack --verbose 2>&1 | grep KVM
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory

Version-Release number of selected component (if applicable):
VMware vSphere 6.0.0
RHEL 7.3 host VM
kernel-3.10.0-514.16.1.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64
libvirt-2.0.0-10.el7_3.9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up the latest RHEL 7.3 VM with "Hardware virtualization" enabled on the VMware hypervisor
2. Register the VM and attach it to a pool that provides qemu-kvm-rhev (either an OpenStack or a RHV subscription is required)
3. yum install libvirt qemu-kvm-rhev libguestfs libguestfs-tools
4. mkdir /mnt/stack
5. modprobe kvm_intel
6. Download rhel-guest-image-7.3-35.x86_64.qcow2 from the Customer Portal to that host VM
7. LIBGUESTFS_BACKEND=direct guestmount -a ./rhel-guest-image-7.3-35.x86_64.qcow2 -i /mnt/stack --verbose 2>&1 | grep ^qemu-kvm
(A consolidated shell version of steps 3-7 is sketched below.)

Actual results:
The qemu-kvm process fails to start with the following error:
qemu-kvm: /builddir/build/BUILD/qemu-2.6.0/target-i386/kvm.c:1736: kvm_put_msrs: Assertion `ret == n' failed.

Expected results:
The qemu-kvm process starts up without errors.

Additional info:
This bug is not reproducible if the host VM is running on a QEMU/KVM RHEL hypervisor, but it is 100% reproducible if the host VM is running on a VMware hypervisor.

--
Kind Regards,
Igor Netkachev
Technical Support Engineer
Red Hat Global Support Services
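Steps 3-7 above, consolidated into one session (a sketch that assumes the VM is already registered per step 2 and the guest image has already been downloaded per step 6):

# Install the virtualization packages and libguestfs tooling
yum install -y libvirt qemu-kvm-rhev libguestfs libguestfs-tools
# Create the mount point and load the KVM module for nested virt
mkdir -p /mnt/stack
modprobe kvm_intel
# Trigger the failure; keep only the qemu-kvm assertion output
LIBGUESTFS_BACKEND=direct guestmount -a ./rhel-guest-image-7.3-35.x86_64.qcow2 \
  -i /mnt/stack --verbose 2>&1 | grep ^qemu-kvm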
(In reply to Igor Netkachev from comment #0)
> Description of problem:
> when running guestmount in nested RHEL vm on vmware vSphere, the action
> fails with:
> qemu-kvm: /builddir/build/BUILD/qemu-2.6.0/target-i386/kvm.c:1736:
> kvm_put_msrs: Assertion `ret == n' failed.

I don't remember the VMware hypervisor environment very well; is there still a COS terminal or something similar? Is there an easy way to get a list of CPU flags on the host, and is there a way for the host hypervisor not to advertise arch_perfmon to the guest?
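From inside the nested RHEL VM, the flags that the L0 hypervisor advertises can be read from /proc/cpuinfo; as a quick sketch, this checks whether arch_perfmon and VMX are being exposed to L1:

# List the CPU feature flags seen by this (L1) kernel, one per line,
# keeping only the PMU and hardware-virt entries
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(arch_perfmon|vmx)$'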
*** Bug 1463502 has been marked as a duplicate of this bug. ***
Hitting the same issue on CentOS 7: trying to run a TripleO installation nested, unsuccessfully due to this bug:

libguestfs-test-tool
 ************************************************************
 *                    IMPORTANT NOTICE
 *
 * When reporting bugs, include the COMPLETE, UNEDITED
 * output below in your bug report.
 *
 ************************************************************
libguestfs: trace: set_verbose true
libguestfs: trace: set_verbose = 0
libguestfs: trace: set_backend "direct"
libguestfs: trace: set_backend = 0
libguestfs: trace: set_verbose true
libguestfs: trace: set_verbose = 0
LIBGUESTFS_DEBUG=1
LIBGUESTFS_BACKEND=direct
LIBGUESTFS_TRACE=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
SELinux: Enforcing
libguestfs: trace: add_drive_scratch 104857600
libguestfs: trace: get_tmpdir
libguestfs: trace: get_tmpdir = "/tmp"
libguestfs: trace: disk_create "/tmp/libguestfsx7NNMa/scratch.1" "raw" 104857600
libguestfs: trace: disk_create = 0
libguestfs: trace: add_drive "/tmp/libguestfsx7NNMa/scratch.1" "format:raw" "cachemode:unsafe"
libguestfs: trace: add_drive = 0
libguestfs: trace: add_drive_scratch = 0
libguestfs: trace: get_append
libguestfs: trace: get_append = "NULL"
guestfs_get_append: (null)
libguestfs: trace: get_autosync
libguestfs: trace: get_autosync = 1
guestfs_get_autosync: 1
libguestfs: trace: get_backend
libguestfs: trace: get_backend = "direct"
guestfs_get_backend: direct
libguestfs: trace: get_backend_settings
libguestfs: trace: get_backend_settings = []
guestfs_get_backend_settings: []
libguestfs: trace: get_cachedir
libguestfs: trace: get_cachedir = "/var/tmp"
guestfs_get_cachedir: /var/tmp
libguestfs: trace: get_direct
libguestfs: trace: get_direct = 0
guestfs_get_direct: 0
libguestfs: trace: get_hv
libguestfs: trace: get_hv = "/usr/libexec/qemu-kvm"
guestfs_get_hv: /usr/libexec/qemu-kvm
libguestfs: trace: get_memsize
libguestfs: trace: get_memsize = 500
guestfs_get_memsize: 500
libguestfs: trace: get_network
libguestfs: trace: get_network = 0
guestfs_get_network: 0
libguestfs: trace: get_path
libguestfs: trace: get_path = "/usr/lib64/guestfs"
guestfs_get_path: /usr/lib64/guestfs
libguestfs: trace: get_pgroup
libguestfs: trace: get_pgroup = 0
guestfs_get_pgroup: 0
libguestfs: trace: get_program
libguestfs: trace: get_program = "libguestfs-test-tool"
guestfs_get_program: libguestfs-test-tool
libguestfs: trace: get_recovery_proc
libguestfs: trace: get_recovery_proc = 1
guestfs_get_recovery_proc: 1
libguestfs: trace: get_smp
libguestfs: trace: get_smp = 1
guestfs_get_smp: 1
libguestfs: trace: get_tmpdir
libguestfs: trace: get_tmpdir = "/tmp"
guestfs_get_tmpdir: /tmp
libguestfs: trace: get_trace
libguestfs: trace: get_trace = 1
guestfs_get_trace: 1
libguestfs: trace: get_verbose
libguestfs: trace: get_verbose = 1
guestfs_get_verbose: 1
host_cpu: x86_64
Launching appliance, timeout set to 600 seconds.
libguestfs: trace: launch
libguestfs: trace: version
libguestfs: trace: version = <struct guestfs_version = major: 1, minor: 32, release: 7, extra: rhel=7,release=3.el7_3.3,libvirt, >
libguestfs: trace: get_backend
libguestfs: trace: get_backend = "direct"
libguestfs: launch: program=libguestfs-test-tool
libguestfs: launch: version=1.32.7rhel=7,release=3.el7_3.3,libvirt
libguestfs: launch: backend registered: unix
libguestfs: launch: backend registered: uml
libguestfs: launch: backend registered: libvirt
libguestfs: launch: backend registered: direct
libguestfs: launch: backend=direct
libguestfs: launch: tmpdir=/tmp/libguestfsx7NNMa
libguestfs: launch: umask=0022
libguestfs: launch: euid=0
libguestfs: trace: get_backend_setting "force_tcg"
libguestfs: trace: get_backend_setting = NULL (error)
libguestfs: trace: get_cachedir
libguestfs: trace: get_cachedir = "/var/tmp"
libguestfs: begin building supermin appliance
libguestfs: run supermin
libguestfs: command: run: /usr/bin/supermin5
libguestfs: command: run: \ --build
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ --if-newer
libguestfs: command: run: \ --lock /var/tmp/.guestfs-0/lock
libguestfs: command: run: \ --copy-kernel
libguestfs: command: run: \ -f ext2
libguestfs: command: run: \ --host-cpu x86_64
libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
libguestfs: command: run: \ -o /var/tmp/.guestfs-0/appliance.d
supermin: version: 5.1.16
supermin: rpm: detected RPM version 4.11
supermin: package handler: fedora/rpm
supermin: acquiring lock on /var/tmp/.guestfs-0/lock
supermin: if-newer: output does not need rebuilding
libguestfs: finished building supermin appliance
libguestfs: begin testing qemu features
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -display none
libguestfs: command: run: \ -help
libguestfs: qemu version 2.6
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -display none
libguestfs: command: run: \ -machine accel=kvm:tcg
libguestfs: command: run: \ -device ?
libguestfs: finished testing qemu features
libguestfs: trace: get_backend_setting "gdb"
libguestfs: trace: get_backend_setting = NULL (error)
[00171ms] /usr/libexec/qemu-kvm \
    -global virtio-blk-pci.scsi=off \
    -nodefconfig \
    -enable-fips \
    -nodefaults \
    -display none \
    -machine accel=kvm:tcg \
    -cpu host \
    -m 500 \
    -no-reboot \
    -rtc driftfix=slew \
    -no-hpet \
    -global kvm-pit.lost_tick_policy=discard \
    -kernel /var/tmp/.guestfs-0/appliance.d/kernel \
    -initrd /var/tmp/.guestfs-0/appliance.d/initrd \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0 \
    -device virtio-scsi-pci,id=scsi \
    -drive file=/tmp/libguestfsx7NNMa/scratch.1,cache=unsafe,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/var/tmp/.guestfs-0/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw \
    -device scsi-hd,drive=appliance \
    -device virtio-serial-pci \
    -serial stdio \
    -device sga \
    -chardev socket,path=/tmp/libguestfsx7NNMa/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append 'panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm'
qemu-kvm: /builddir/build/BUILD/qemu-2.6.0/target-i386/kvm.c:1736: kvm_put_msrs: Assertion `ret == n' failed.
libguestfs: error: appliance closed the connection unexpectedly, see earlier error messages
libguestfs: child_cleanup: 0x7f94990357d0: child process died
libguestfs: sending SIGTERM to process 7056
libguestfs: error: /usr/libexec/qemu-kvm killed by signal 6 (Aborted), see debug messages above
libguestfs: error: guestfs_launch failed, see earlier error messages
libguestfs: trace: launch = -1 (error)
libguestfs-test-tool: failed to launch appliance
libguestfs: trace: close
libguestfs: closing guestfs handle 0x7f94990357d0 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /tmp/libguestfsx7NNMa
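The trace above shows libguestfs probing the force_tcg backend setting before launching with KVM. As a way to confirm that only the KVM code path is affected (a sketch relying on the LIBGUESTFS_BACKEND_SETTINGS variable documented in guestfs(3); TCG is much slower), the appliance can be forced onto software emulation:

# Force the libguestfs appliance to run under TCG instead of KVM
export LIBGUESTFS_BACKEND=direct
export LIBGUESTFS_BACKEND_SETTINGS=force_tcg
libguestfs-test-tool

If that launches cleanly, it would be consistent with the reporter's observation in comment #0 that the assertion does not fire when kvm_intel is not loaded.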
Tried on VMware ESXi 5.5, 6.0, and 6.1; all show the same result.
(In reply to korenlev from comment #8)
> Tried on VMware ESXi 5.5, 6.0, and 6.1; all show the same result.

David (dgilbert) pointed me to https://bugs.launchpad.net/qemu/+bug/1661386. I am assuming you are the reporter of that bug?

I agree with Paolo's reasoning there. If ESX is L0, it is up to it to advertise the PMU correctly and not fail reads/writes for whatever PMU MSRs L1 tries to access. In L1 we could try to disable the PMU when we detect a nested environment, but I think that is something that can easily be done with -cpu host,pmu=off.

Can you try running an upstream version of QEMU (> 2.9) in L1? David tells me that newer versions of QEMU print a message when an MSR operation fails.
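As a concrete sketch of that workaround, this is the failing appliance command line from the trace above with only the -cpu argument changed (remaining options elided; whether libguestfs lets you inject this flag without patching is a separate question):

# Same invocation as the failing trace, but with the vPMU turned off so
# kvm_put_msrs never has to write the PMU MSRs that L0 rejects
/usr/libexec/qemu-kvm \
    -display none \
    -machine accel=kvm:tcg \
    -cpu host,pmu=off \
    -m 500 \
    -kernel /var/tmp/.guestfs-0/appliance.d/kernel \
    -initrd /var/tmp/.guestfs-0/appliance.d/initrd \
    ...

For libvirt-managed guests, the equivalent should be <pmu state='off'/> inside the domain XML's <features> element.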
With ESX being L0, the bug is in how it emulates the PMU MSRs. I believe there is an easy workaround: do not advertise the PMU to the guest, as mentioned in one of the comments above.
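For completeness, a sketch of the L0-side version of that workaround on ESXi (the .vmx key is quoted from memory of VMware's vPMC feature and should be treated as an assumption; the corresponding vSphere UI checkbox is "Virtualize CPU performance counters" in the VM's CPU settings):

# With the VM powered off, disable virtual performance counters in its
# .vmx file (the datastore path here is hypothetical)
echo 'vpmc.enable = "FALSE"' >> /vmfs/volumes/datastore1/rhel73-host/rhel73-host.vmx

Whether this also stops ESX from advertising arch_perfmon in CPUID, rather than just the counters themselves, is worth verifying.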