Bug 2097461
| Summary: | Qemu-kvm VM can not be started any more - "Failed to get write lock" | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Robert_Schulte |
| Component: | qemu-kvm | Assignee: | Hanna Czenczek <hreitz> |
| qemu-kvm sub component: | Storage | QA Contact: | Tingting Mao <timao> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | low | ||
| Priority: | unspecified | CC: | bstinson, coli, jinzhao, juzhang, jwboyer, kkiwi, kwolf, qinwang, timao, virt-maint |
| Version: | CentOS Stream | Keywords: | Triaged |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-06-30 02:16:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
1) Same behavior for new machines (they can't be created).
2) I ruled out the lock manager as the problem. I started with a "vanilla" install without giving this topic any thought or configuring anything; then, after running into this problem, I tried sanlock and an explicit (virt)lockd config.

Could you please run the following to check whether the VM is already running?

virsh list --all

(In reply to qing.wang from comment #2)
I already did that before. No machine is running.

Example output for a new machine created with virtual machine manager.
I just
- added the existing image
- set 8GB of mem, 8 CPUs
- changed the MAC
When I then click "Begin Install", I get the following message:
Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: 2022-06-15T17:46:09.436519Z qemu-kvm: -device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}: Failed to get "write" lock
Is another process using the image [/var/lib/libvirt/images/QCOW10GB/D75.img]?'
Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in cb_wrapper
callback(asyncjob, *args, **kwargs)
File "/usr/share/virt-manager/virtManager/createvm.py", line 2001, in _do_async_install
installer.start_install(guest, meter=meter)
File "/usr/share/virt-manager/virtinst/install/installer.py", line 701, in start_install
domain = self._create_guest(
File "/usr/share/virt-manager/virtinst/install/installer.py", line 649, in _create_guest
domain = self.conn.createXML(install_xml or final_xml, 0)
File "/usr/lib64/python3.9/site-packages/libvirt.py", line 4393, in createXML
domain will be automatically destroyed when the virConnectPtr
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-06-15T17:46:09.436519Z qemu-kvm: -device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}: Failed to get "write" lock
Is another process using the image [/var/lib/libvirt/images/QCOW10GB/D75.img]?
(In reply to Robert_Schulte from comment #4)
Could you please check whether another process is using /var/lib/libvirt/images/QCOW10GB/D75.img? The command lines could be something like:

# lsof /var/lib/libvirt/images/QCOW10GB/D75.img
and
# qemu-img info /var/lib/libvirt/images/QCOW10GB/D75.img

Could you please share the results?

Thanks.

(In reply to Tingting Mao from comment #5)
No problem:

[root@newtwo ~]# qemu-img info /var/lib/libvirt/images/QCOW10GB/D75.img
image: /var/lib/libvirt/images/QCOW10GB/D75.img
file format: raw
virtual size: 100 GiB (107374182400 bytes)
disk size: 91.2 GiB
[root@newtwo ~]# lsof /var/lib/libvirt/images/QCOW10GB/D75.img
[root@newtwo ~]#

File permissions and network issues were also not a problem before. NFS file locking is off on the server.

[root@newtwo ~]# mount | grep QCOW
xxxxxx.xxxxxxx.local:VM_storage on /var/lib/libvirt/images/QCOW10GB type nfs (rw,nosuid,nodev,noexec,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.90.2,mountvers=3,mountport=30000,mountproto=udp,local_lock=none,addr=192.168.90.2)
[root@newtwo ~]# ls -altrh /var/lib/libvirt/images/QCOW10GB/D75.img
-rwxrwxrwx 1 root root 100G Jun 10 13:04 /var/lib/libvirt/images/QCOW10GB/D75.img
[root@newtwo ~]# ps -aux | grep libvirtd
root 6034 0.2 0.0 1682844 54796 ? Ssl Jun15 2:51 /usr/sbin/libvirtd --timeout 120
root 13281 0.0 0.0 6400 2224 pts/0 S+ 10:13 0:00 grep --color=auto libvirtd
[root@newtwo ~]# touch /var/lib/libvirt/images/QCOW10GB/D75.img
[root@newtwo ~]#

Does this command find anything: ps -auww|grep D75

(In reply to qing.wang from comment #7)
No relevant ones.

[root@newtwo ~]# ps -auww|grep D75
root 14320 0.0 0.0 6400 2272 pts/0 S+ 11:15 0:00 grep --color=auto D75
[root@newtwo ~]#

Tried in qemu layer, but did not hit the issue.
Tested with:
qemu-kvm-7.0.0-4.el9
kernel-5.14.0-96.el9.x86_64

Steps:
Mount info:
# nfsstat -m
/home/timao/test from localhost:/home/nfs_share
Flags: rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,mountaddr=::1,mountvers=3,mountport=20048,mountproto=udp6,local_lock=none,addr=::1

Tried to boot an existing image under the NFS-mounted dir:
# /usr/libexec/qemu-kvm /home/timao/test/base.qcow2
qemu-kvm: Machine type 'pc-i440fx-rhel7.6.0' is deprecated: machine types for previous major releases are deprecated
VNC server running on ::1:5900

Results:
As above, there are no lock issues.

Also tried in qemu layer:
/var/lib/libvirt/images/QCOW10GB from bigstorageone.xxxxx.local:VM_storage
Flags: rw,nosuid,nodev,noexec,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.90.2,mountvers=3,mountport=30000,mountproto=udp,local_lock=none,addr=192.168.90.2
[root@newtwo ~]# /usr/libexec/qemu-kvm /var/lib/libvirt/images/QCOW10GB/D75.img
WARNING: Image format was not specified for '/var/lib/libvirt/images/QCOW10GB/D75.img' and probing guessed raw.
Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
Specify the 'raw' format explicitly to remove the restrictions.
qemu-kvm: Machine type 'pc-i440fx-rhel7.6.0' is deprecated: machine types for previous major releases are deprecated
qemu-kvm: Failed to get "write" lock
Is another process using the image [/var/lib/libvirt/images/QCOW10GB/D75.img]?
[root@newtwo ~]#
Could you please try again with a local image file rather than an image on NFS?

Thanks

Can you create a new file manually under the NFS folder:

qemu-img create -f qcow2 /var/lib/libvirt/images/test.img 10G

and then execute:

/usr/libexec/qemu-kvm /var/lib/libvirt/images/test.img

Are other hosts/VMs using the mentioned image?

(In reply to qing.wang from comment #13)
No other VMs use it.

(In reply to Tingting Mao from comment #11)
The local file seems to work:

[root@newtwo ~]# /usr/libexec/qemu-kvm /home/D75.img
WARNING: Image format was not specified for '/home/D75.img' and probing guessed raw.
Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
Specify the 'raw' format explicitly to remove the restrictions.
qemu-kvm: Machine type 'pc-i440fx-rhel7.6.0' is deprecated: machine types for previous major releases are deprecated
VNC server running on ::1:5900

(In reply to qing.wang from comment #12)
This also works.
[root@newtwo ~]# qemu-img create -f qcow2 /var/lib/libvirt/images/QCOW10GB/test.img 10G
Formatting '/var/lib/libvirt/images/QCOW10GB/test.img', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=10737418240 lazy_refcounts=off refcount_bits=16
[root@newtwo ~]# /usr/libexec/qemu-kvm /var/lib/libvirt/images/QCOW10GB/test.img
qemu-kvm: Machine type 'pc-i440fx-rhel7.6.0' is deprecated: machine types for previous major releases are deprecated
VNC server running on ::1:5900

[root@newtwo ~]# ls -altrh /var/lib/libvirt/images/QCOW10GB/
-rwxrwxrwx 1 root root 100G Jun 16 10:13 D75.img
drwxrwxrwx 6 root root 4.0K Jun 16 17:05 .
-rw-r--r-- 1 root root 193K Jun 16 17:05 test.img

So the question is: where, and by whom, are those already existing files locked?

(In reply to Robert_Schulte from comment #16)
It's interesting that the newly created image is not locked. So I would like to confirm Qing's question again:
- As the image is on an NFS server, we would like to know whether any other NFS client is also mounting the same NFS server folder, and whether another NFS client/host (rather than this one) is using the image.

(In reply to Tingting Mao from comment #17)
There is another client mounting the directory, but the image is definitely in use by only one server. How/where are those locks set by default? The directories defined in the default configuration either don't exist or are empty.

Initially you reported that this problem occurred after an update, while now it looks like we're looking for a problem in the environment.

You reported the package versions in use after the update. Can you please also provide the old package versions where it was working? Can you try selectively downgrading the qemu-kvm (and qemu-img, if a dependency requires it) package to the old version while leaving everything else as it is and check if the problem indeed disappears?

If we can find two package versions that are relatively close to each other where the older one works and the newer doesn't, looking at the changes made between them could be helpful. At the moment, I don't see any suspicious change in qemu-kvm-7.0.0-4.el9 compared to previous 7.0.0 packages.

Hi Kevin, one strange point is that qemu-img info does not detect that the file is locked, but qemu does. What is the difference?

Hi Robert_Schulte,
1. As you said, this file is used by just one client. Could you please try restarting the NFS server and then run this on the client: /usr/libexec/qemu-kvm /var/lib/libvirt/images/QCOW10GB/D75.img
2. Do all NFS-hosted images have the issue after the update, or just D75.img?
3. Do local images have the issue after the update?
4. Was the VM still running when you executed the "update" you mentioned?
5. Did you find anything on the NFS server?

(In reply to qing.wang from comment #21)
I don't know your exact command line for qemu-kvm, but I suspect that the difference might be that 'qemu-img info' uses the image file read-only whereas your qemu-kvm command line uses it read-write. Two processes can access the same image file read-only, but read-write has to be exclusive access.
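
The read-only versus read-write distinction above is the usual shared/exclusive lock behaviour. The following is a minimal Python sketch of that behaviour only, assuming a 64-bit Linux host, Python 3.9 or later, and Linux open file description (OFD) locks as the mechanism. It locks a scratch temporary file as a whole; it is not QEMU's actual locking scheme, and the names try_ofd_lock and demo are made up for this illustration.

```python
import errno
import fcntl
import os
import struct
import tempfile


def try_ofd_lock(fd, lock_type):
    """Try a non-blocking whole-file OFD lock; True on success, False on conflict."""
    # Assumed layout of struct flock on 64-bit Linux: l_type and l_whence (short),
    # l_start and l_len (off_t), l_pid (must be 0 for OFD locks), 4 padding bytes.
    # l_len == 0 means "from l_start to the end of the file".
    flock = struct.pack("hhqqi4x", lock_type, os.SEEK_SET, 0, 0, 0)
    try:
        fcntl.fcntl(fd, fcntl.F_OFD_SETLK, flock)
        return True
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return False  # a conflicting lock is held via another open file description
        raise


def demo():
    """Two shared (read) locks coexist; an exclusive (write) lock has to wait."""
    with tempfile.NamedTemporaryFile() as image:
        # Each os.open() creates its own open file description, standing in
        # for a separate process that opens the same image file.
        reader_a = os.open(image.name, os.O_RDONLY)
        reader_b = os.open(image.name, os.O_RDONLY)
        writer = os.open(image.name, os.O_RDWR)
        try:
            result = {
                "reader_a": try_ofd_lock(reader_a, fcntl.F_RDLCK),
                "reader_b": try_ofd_lock(reader_b, fcntl.F_RDLCK),
                "writer_while_readers_hold": try_ofd_lock(writer, fcntl.F_WRLCK),
            }
            # Drop both shared locks, then retry the exclusive lock.
            try_ofd_lock(reader_a, fcntl.F_UNLCK)
            try_ofd_lock(reader_b, fcntl.F_UNLCK)
            result["writer_after_release"] = try_ofd_lock(writer, fcntl.F_WRLCK)
            return result
        finally:
            for fd in (reader_a, reader_b, writer):
                os.close(fd)


if __name__ == "__main__":
    print(demo())
```

Both readers obtain their shared lock, while the writer is refused until they release. This mirrors why a read-only open of an image can succeed while a read-write open of the same image is rejected.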
(In reply to Kevin Wolf from comment #20)
Sorry for the delay. Due to technical problems I cannot access our systems at the moment, but as soon as the problems are solved I will check and report here.

Can you tell me where I can explicitly look for any traces of a lock? So far I have never found any lock file in any of the lockd / virtlockd directories.

(In reply to Robert_Schulte from comment #24)
These locks aren't using lock files but OFD (Open File Description) locks on the image file. In theory, lslocks should display them (though it can't know the process that locked the file), but it doesn't even display the file name for me when I just tested it.

I can see the locks in /proc/$PID/fdinfo/$IMAGE_FD if you already know which process to inspect.

One more question: what is the NFS server, and what are the NFS export options?

(In reply to Kevin Wolf from comment #25)
No process was found with the following commands:
lsof /var/lib/libvirt/images/QCOW10GB/D75.img
and
ps -auww|grep D75

Any suggestions for finding the related process?

(In reply to Robert_Schulte from comment #24)
Do you have any chance to access the systems?
If not, I'd like to downgrade the severity for our tracking, thanks.

(In reply to Kevin Wolf from comment #20)
Finally I could continue on this topic. I asked for a restart of the NFS server and the problem is gone. I can now also see the locks. I will close the ticket.

virtlockd 89020 POSIX 100G WRITE 0   0   0 /var/lib/libvirt/images/QCOW10GB/D75.img
virtlockd 89020 POSIX   8G WRITE 0   0   0 /var/lib/libvirt/images/QCOW10GB/D75_swap.img
qemu-kvm  92090 POSIX   5B WRITE 0   0   0 /run/libvirt/qemu/D75.pid
qemu-kvm  92090 POSIX 100G READ  0 100 101 /var/lib/libvirt/images/QCOW10GB/D75.img
qemu-kvm  92090 POSIX 100G READ  0 201 201 /var/lib/libvirt/images/QCOW10GB/D75.img
qemu-kvm  92090 POSIX   8G READ  0 100 101 /var/lib/libvirt/images/QCOW10GB/D75_swap.img
qemu-kvm  92090 POSIX   8G READ  0 201 201 /var/lib/libvirt/images/QCOW10GB/D75_swap.img

Closing this bug accordingly; please reopen if needed. Thanks.
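
The lock listing posted in the closing comment can be searched mechanically when the question is "who holds a lock on this image?". The following is a small Python sketch that parses rows in that layout and lists the holders for one path. It assumes the rows follow the lslocks column order (COMMAND, PID, TYPE, SIZE, MODE, M, START, END, PATH), which the listing appears to use but the thread does not state; the names LockRow, parse_lock_rows and holders_of are invented for this example.

```python
from typing import NamedTuple


class LockRow(NamedTuple):
    command: str
    pid: int
    lock_type: str
    size: str
    mode: str
    mandatory: int
    start: int
    end: int
    path: str


def parse_lock_rows(text):
    """Parse whitespace-separated lock rows; skip headers and malformed lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 8)  # at most 9 fields; the path is the last one
        if len(fields) != 9:
            continue
        command, pid, lock_type, size, mode, mandatory, start, end, path = fields
        try:
            rows.append(LockRow(command, int(pid), lock_type, size, mode,
                                int(mandatory), int(start), int(end), path))
        except ValueError:
            continue  # e.g. a header line, where PID is not a number
    return rows


def holders_of(rows, path):
    """Return the distinct (command, pid, mode) triples holding locks on path."""
    return sorted({(r.command, r.pid, r.mode) for r in rows if r.path == path})


# Rows copied from the listing in the closing comment of this report.
SAMPLE = """\
virtlockd 89020 POSIX 100G WRITE 0 0 0 /var/lib/libvirt/images/QCOW10GB/D75.img
virtlockd 89020 POSIX 8G WRITE 0 0 0 /var/lib/libvirt/images/QCOW10GB/D75_swap.img
qemu-kvm 92090 POSIX 5B WRITE 0 0 0 /run/libvirt/qemu/D75.pid
qemu-kvm 92090 POSIX 100G READ 0 100 101 /var/lib/libvirt/images/QCOW10GB/D75.img
qemu-kvm 92090 POSIX 100G READ 0 201 201 /var/lib/libvirt/images/QCOW10GB/D75.img
qemu-kvm 92090 POSIX 8G READ 0 100 101 /var/lib/libvirt/images/QCOW10GB/D75_swap.img
qemu-kvm 92090 POSIX 8G READ 0 201 201 /var/lib/libvirt/images/QCOW10GB/D75_swap.img
"""

if __name__ == "__main__":
    rows = parse_lock_rows(SAMPLE)
    for holder in holders_of(rows, "/var/lib/libvirt/images/QCOW10GB/D75.img"):
        print(holder)
```

A listing like this only covers locks visible on the host where it was produced, so it would not have shown a lock that existed only on the NFS server side.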
Description of problem:
Since the latest update, VMs can't be started, neither with virsh nor with virtual machine manager.

Version-Release number of selected component (if applicable):
qemu-img.x86_64 17:7.0.0-4.el9 @appstream
qemu-kvm.x86_64 17:7.0.0-4.el9 @appstream
libvirt.x86_64 8.3.0-1.el9 @appstream
libvirt-client.x86_64 8.3.0-1.el9 @appstream
libvirt-daemon.x86_64 8.3.0-1.el9 @appstream

How reproducible:
Start an existing VM via virsh start or virtual machine manager.

Steps to Reproduce:
1. virsh start xy (with xy = name of the VM)

Actual results:
libvirtd[6034]: internal error: qemu unexpectedly closed the monitor: 2022-06-15T17:35:47.879425Z qemu-kvm: -device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}: Failed to get "write" lock
Is another process using the image [/var/lib/libvirt/images/....]?

Expected results:
The VM should start just as it did before the last "yum update".

Additional info:
When setting the images to "shareable" and "read only", the machines can be started (but are useless and fail when booting).