Bug 1576306
Summary: Wake up from hibernation failed: internal error: unable to execute QEMU command 'cont': Failed to get "write" lock.
Product: [oVirt] ovirt-engine
Component: BLL.Storage
Version: 4.2.3.2
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Reporter: Israel Pinto <ipinto>
Assignee: Tal Nisan <tnisan>
QA Contact: Elad <ebenahar>
CC: bugs, famz, ipinto, kwolf, tnisan
Keywords: Automation
Flags: ipinto: planning_ack? ipinto: devel_ack? ipinto: testing_ack?
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
oVirt Team: Storage
Type: Bug
Last Closed: 2018-05-14 14:46:17 UTC
Description
Israel Pinto 2018-05-09 08:09:04 UTC

Created attachment 1433607 [details]: host-var/run logs
Created attachment 1433608 [details]: host-var/log logs
Created attachment 1433609 [details]: engine log
Created attachment 1433614 [details]: last engine log
Israel, can you please send me the mount output for the domain?

mount output:

[root@master-vds10 mnt]# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=10145176k,nr_inodes=2536294,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuacct,cpu)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_prio,net_cls)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices)
configfs on /sys/kernel/config type configfs (rw,relatime)
/dev/mapper/VolGroup01-root on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=303)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel)
hugetlbfs on /dev/hugepages1G type hugetlbfs (rw,relatime,seclabel,pagesize=1G)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
tmpfs on /tmp type tmpfs (rw,seclabel)
/dev/sda1 on /boot type ext3 (rw,relatime,seclabel,data=ordered)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
yellow-vdsb.qa.lab.tlv.redhat.com:/Compute_NFS/alukiano/compute-ge-he-1 on /rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Compute__NFS_alukiano_compute-ge-he-1 type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=10.35.80.5,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=10.35.80.5)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=2450972k,mode=700)

Kevin, I can't understand which lock fails here, could you advise?

(In reply to Tal Nisan from comment #7)
> Kevin, I can't understand which lock fails here, could you advise?

Changing the NEEDINFO to Fam, who is responsible for the locking code.

Besides the qemu-kvm command invocation error, I see a qemu-img convert error in the logs too:

> error while converting qcow2: Failed to get "write" lock
> Is another process using the image?
I'm wondering what exactly happens in the test case. Is it possible that multiple QEMU commands (either qemu-kvm or qemu-img) are executed concurrently? That is the usual trigger of the locking error.
Is it possible to grab output of "ps aux" when the error is raised?
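The concurrency hypothesis is easy to reproduce outside QEMU. Below is a minimal sketch using flock(1) as a stand-in for QEMU's image locking (an assumption for illustration: QEMU actually takes OFD byte-range locks on the image file, but the conflict between two openers behaves the same way):

```shell
# Stand-in for the qcow2 image file (hypothetical temp file, not the real image).
img=$(mktemp)

# First opener (think: the running qemu-kvm process) takes an exclusive
# lock on the image via fd 9 and keeps holding it.
exec 9>"$img"
flock -x -n 9

# Second opener (think: a concurrent qemu-img convert) opens the same file
# through a new file descriptor and tries to lock it; this fails because
# flock conflicts apply across separate open file descriptions.
if flock -x -n "$img" -c true; then
    msg="lock acquired"
else
    msg='Failed to get "write" lock: another process is using the image'
fi
echo "$msg"

# Release the lock and clean up.
exec 9>&-
rm -f "$img"
```

With two truly concurrent qemu-kvm or qemu-img invocations on the same image, the later one reports the same error string seen in the logs, which is why grabbing "ps aux" at failure time is the right next step.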
[root@master-vds10 ~]# ps -aux | grep qemu | grep memory qemu 17745 1.7 1.3 1890516 325164 ? Sl 16:46 0:10 /usr/libexec/qemu-kvm -name guest=memory_hotplug_test,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-265-memory_hotplug_test/master-key.aes -machine pc-i440fx-rhel7.5.0,accel=kvm,usb=off,dump-guest-core=off -cpu Conroe -m size=1048576k,slots=16,maxmem=16777216k -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0,mem=1024 -uuid 3e5c9189-4e59-456f-8277-eb40e47dc231 -smbios type=1,manufacturer=oVirt,product=RHEV Hypervisor,version=7.5-8.el7,serial=55157E90-C28C-11E0-983A-5CF3FC78F458,uuid=3e5c9189-4e59-456f-8277-eb40e47dc231 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-265-memory_hotplug_test/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2018-05-13T13:46:07,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=ua-5686e686-fe5a-427c-bfe4-cf6f9889ecd2,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=ua-1396692c-e614-4f77-abed-43236884e746,max_ports=16,bus=pci.0,addr=0x5 -drive if=none,id=drive-ua-0a6e36d3-853f-41ab-b466-0a2a8ddc2444,readonly=on,werror=report,rerror=report -device ide-cd,bus=ide.1,unit=0,drive=drive-ua-0a6e36d3-853f-41ab-b466-0a2a8ddc2444,id=ua-0a6e36d3-853f-41ab-b466-0a2a8ddc2444 -drive file=/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Compute__NFS_GE_compute-he-1_nfs__0/4bf1f6d3-b754-414e-85bc-78584a7f6b5d/images/946c884e-8318-4232-ae9a-b13a8cf9bfa4/4b473dd0-4bf2-438e-9fbc-f2602781e1cd,format=qcow2,if=none,id=drive-ua-946c884e-8318-4232-ae9a-b13a8cf9bfa4,serial=946c884e-8318-4232-ae9a-b13a8cf9bfa4,cache=none,werror=stop,rerror=stop,aio=threads -device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-ua-946c884e-8318-4232-ae9a-b13a8cf9bfa4,id=ua-946c884e-8318-4232-ae9a-b13a8cf9bfa4,bootindex=1 -netdev tap,fd=33,id=hostua-bb80cd20-b3c9-447a-8087-a7d4831e4cbe,vhost=on,vhostfd=35 -device virtio-net-pci,netdev=hostua-bb80cd20-b3c9-447a-8087-a7d4831e4cbe,id=ua-bb80cd20-b3c9-447a-8087-a7d4831e4cbe,mac=00:1a:4a:16:10:49,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/3e5c9189-4e59-456f-8277-eb40e47dc231.ovirt-guest-agent.0,server,nowait -device virtserialport,bus=ua-1396692c-e614-4f77-abed-43236884e746.0,nr=1,chardev=charchannel0,id=channel0,name=ovirt-guest-agent.0 -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/3e5c9189-4e59-456f-8277-eb40e47dc231.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=ua-1396692c-e614-4f77-abed-43236884e746.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=ua-1396692c-e614-4f77-abed-43236884e746.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5901,tls-port=5902,addr=10.35.92.21,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -device qxl-vga,id=ua-b6bbbf35-55a6-4bc3-ab48-39c0d9343bcf,ram_size=67108864,vram_size=33554432,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=ua-bdc1eb5e-98ab-432a-b00e-eb34ecd8f5f5,bus=pci.0,addr=0x4 -device hda-duplex,id=ua-bdc1eb5e-98ab-432a-b00e-eb34ecd8f5f5-codec0,bus=ua-bdc1eb5e-98ab-432a-b00e-eb34ecd8f5f5.0,cad=0 -object rng-random,id=objua-38d1b121-6070-449b-9cbc-93bf94daf1b5,filename=/dev/urandom -device virtio-rng-pci,rng=objua-38d1b121-6070-449b-9cbc-93bf94daf1b5,id=ua-38d1b121-6070-449b-9cbc-93bf94daf1b5,bus=pci.0,addr=0x8 -device vmcoreinfo -msg timestamp=on 
VM has been paused due to a storage I/O problem.

May 13, 2018, 4:25:12 PM: VM memory_hotplug_test is down with error. Exit message: Wake up from hibernation failed: xml in virDomainDefineXML must not be NULL.

Adding logs.

Created attachment 1435682 [details]: logs_13_5_18

I rechecked the NFS (nfs_0) settings on the setup and found the following:
1. Retransmissions was set to 2
2. Timeout was set to 1

I reran the suspend/resume test on a VM with hot-plugged memory. The VM is UP after suspend in both cases (hot-plug and hot-unplug memory). I think the errors we saw were caused by these settings (which were leftovers on the setup).

Seems like it indeed, closing as not a bug.

Clearing old needinfo.
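For context on why those leftover settings matter: on a soft NFS mount the client gives up after `retrans` retries of `timeo` (expressed in tenths of a second, per nfs(5)) and returns an I/O error to the application, which is what pauses the VM or fails qemu-img. The annotated fragment below is a sketch; whether the oVirt storage-domain settings for nfs_0 map one-to-one onto these client mount options is an assumption.

```shell
# Healthy options seen on the working domain in the mount output above:
#   soft,timeo=600,retrans=6   -> 60 s per try, 6 retries before EIO
# Leftover settings found on nfs_0:
#   timeo=1,retrans=2          -> 0.1 s per try, only a few attempts before EIO
# With the latter, any brief server or network hiccup surfaces to
# qemu-kvm/qemu-img as an I/O error.

# List the effective NFS client options on a host (prints nothing if no
# NFS mounts are present):
grep ' type nfs ' /proc/mounts || true
```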