Bug 1687176
| Summary: | virDomainBlockJobAbort with VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT gets stuck when source chain includes NBD layer | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Nir Soffer <nsoffer> |
| Component: | qemu-kvm-rhev | Assignee: | Eric Blake <eblake> |
| Status: | CLOSED DEFERRED | QA Contact: | aihua liang <aliang> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.6 | CC: | coli, eblake, jinzhao, juzhang, kwolf, nsoffer, pkrempa, virt-maint |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-07-22 20:31:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
RHV does not use disks with an NBD layer yet, but we plan to use this soon for migrating VMs from external systems to RHV.

What qemu version are you using? Is there anything of interest in the VM log file?

Libvirt does not do anything differently when an NBD disk is in the chain, so this is most probably a qemu problem. Please add the qemu version so the bug can be moved appropriately.

(In reply to Peter Krempa from comment #3)
> What qemu version are you using?

Tested on:

CentOS 7.6: qemu-kvm-ev-2.12.0-18.el7_6.3.1.x86_64
Fedora 28: qemu-kvm-3.1.0-4.fc28.1.x86_64

Created attachment 1542904 [details]
qemu log file from centos 7.6
Created attachment 1542905 [details]
qemu log file from fedora 28
Created attachment 1542922 [details]
libvirtd.log from fedora 28 system
The vm is called external-disk-nbd, started on:
2019-03-11 15:18:38.142+0000
Created attachment 1542924 [details]
vdsm log from fedora 28 system
Created attachment 1542938 [details]
qemu backtrace stuck vm
Looking at attachment 1542938 [details], it looks like the qemu monitor thread is stuck
trying to close the NBD connection.
Thread 1 (Thread 0x7fb775a38f40 (LWP 20610)):
#0 0x00007fb76eb02526 in ppoll () from target:/lib64/libc.so.6
#1 0x000055bb79596189 in qemu_poll_ns ()
#2 0x000055bb79598234 in aio_poll ()
#3 0x000055bb79520d7a in nbd_client_close ()
#4 0x000055bb7951e64e in nbd_close ()
#5 0x000055bb794c3132 in bdrv_unref ()
#6 0x000055bb794c315f in bdrv_unref ()
#7 0x000055bb794c3075 in bdrv_set_backing_hd ()
#8 0x000055bb794c314b in bdrv_unref ()
#9 0x000055bb794c3075 in bdrv_set_backing_hd ()
#10 0x000055bb794c314b in bdrv_unref ()
#11 0x000055bb7950fd05 in mirror_exit_common ()
#12 0x000055bb794c88e9 in job_do_finalize ()
#13 0x000055bb794c8b94 in job_exit ()
#14 0x000055bb79594bf6 in aio_bh_poll ()
#15 0x000055bb79597f24 in aio_dispatch ()
#16 0x000055bb79594ad2 in aio_ctx_dispatch ()
#17 0x00007fb774ccb91d in g_main_context_dispatch () from target:/lib64/libglib-2.0.so.0
#18 0x000055bb795971a8 in main_loop_wait ()
#19 0x000055bb792f7fa9 in main_loop ()
#20 0x000055bb7919a8cc in main ()
The NBD server is qemu-nbd running on a CentOS 7.6 host:
qemu-img-ev-2.12.0-18.el7_6.3.1.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ff1f138b2cf in ppoll () from /lib64/libc.so.6
Thread 1 (Thread 0x7ff208457b80 (LWP 21013)):
#0 0x00007ff1f138b2cf in ppoll () from /lib64/libc.so.6
#1 0x0000560934b99e7b in qemu_poll_ns ()
#2 0x0000560934b9ad3e in main_loop_wait ()
#3 0x0000560934ad78bc in main ()
Eric, this seems like a qemu issue.
Kevin, is this related to any of the bdrv_drain_all bugs you've been trying to plug?

Created attachment 1542940 [details]
vm xml reproducing this issue
Created attachment 1542953 [details]
backtrace after shutting down the vm from oVirt
I can reproduce the same issue by:

1. Starting a VM with an NBD layer (see attachment 1542940 [details])
2. Shutting down the VM in oVirt
3. The VM gets stuck, blocked on:

Thread 1 (Thread 0x7f8a341ebf40 (LWP 9729)):
#0 0x00007f8a2d2b5526 in ppoll () from target:/lib64/libc.so.6
#1 0x000055b79463f189 in qemu_poll_ns ()
#2 0x000055b794641234 in aio_poll ()
#3 0x000055b7945c9d7a in nbd_client_close ()
#4 0x000055b7945c764e in nbd_close ()
#5 0x000055b79456c132 in bdrv_unref ()
#6 0x000055b79456c15f in bdrv_unref ()
#7 0x000055b79456c075 in bdrv_set_backing_hd ()
#8 0x000055b79456c14b in bdrv_unref ()
#9 0x000055b7945b0605 in blk_remove_bs ()
#10 0x000055b7945b0664 in blk_remove_all_bs ()
#11 0x000055b7945691a3 in bdrv_close_all ()
#12 0x000055b7942438e0 in main ()

See attachment 1542953 [details] for the full backtrace.

(In reply to Eric Blake from comment #11)
> Kevin, is this related to any of the bdrv_drain_all bugs you've been trying
> to plug?

I would hope so, but the backtrace is just what any hang typically looks like, so I can't tell for sure. I guess it's a matter of testing against upstream. I suspect that maybe even just commit 4720cbee is enough, which just adds an aio_wait_kick().

Reproduced with qemu master:

commit 2e35c7738e16452a95aec1fbef5e8b6f38a2d647 (HEAD -> master, origin/master, origin/HEAD)
Merge: 79d8b1dc5b ff243cff11
Author: Peter Maydell <peter.maydell>
Date:   Mon Mar 11 13:45:37 2019 +0000

Same backtrace when shutting down a vm.

When qemu is built with --enable-debug, I cannot reproduce.
I can start and stop the same VM from attachment 1542940 [details] without
any issue.
So it's either debug mode, or I was not running the master qemu when
I reproduced this before.

When built with --enable-debug, libvirt failed to start qemu because of
bad SELinux labels on /usr/local/bin/qemu-system-*, so I suspect that
libvirt was using /usr/bin/qemu-system-*.
Same issue as bug 1655367 comment 4: https://bugzilla.redhat.com/show_bug.cgi?id=1655367#c4
It's a block-mirror getting stuck issue.
Description of problem:

When completing live storage migration in RHV, if the source image backing file is NBD (e.g. nbd://server:port), the block job gets stuck in "pivot" status.

Here is the flow. It should be possible to reproduce with plain libvirt and virsh, but I did not try.

1. Preparing a disk

Run qemu-nbd on some host, serving the image:

$ qemu-nbd \
    --read-only \
    --port 9000 \
    --persistent \
    --export-name '' \
    -f raw \
    /var/tmp/fedora-29.raw

Open the port on the server:

# firewall-cmd --add-port 9000/tcp

Attach the NBD backing file to a block-based disk:

$ qemu-img rebase -u -b nbd://voodoo7.tlv.redhat.com:9000 -F raw /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc

Testing that disk with the NBD backing file:

$ qemu-img info --backing-chain /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc
image: /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 0
cluster_size: 65536
backing file: nbd://voodoo7.tlv.redhat.com:9000
backing file format: raw
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: nbd://voodoo7.tlv.redhat.com:9000
file format: raw
virtual size: 6.0G (6442450944 bytes)
disk size: unavailable

2. Starting the VM

We have one qcow2 disk with an NBD backing file:

<disk type='block' device='disk' snapshot='no'>
  <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
  <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore type='network' index='1'>
    <format type='raw'/>
    <source protocol='nbd'>
      <host name='voodoo7.tlv.redhat.com' port='9000'/>
    </source>
  </backingStore>
  <target dev='sda' bus='scsi'/>
  <serial>637a4996-28cd-4587-846b-197d69185316</serial>
  <boot order='1'/>
  <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

3. Creating a snapshot

The disk XML after the snapshot:

<disk type='block' device='disk' snapshot='no'>
  <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
  <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore type='block' index='1'>
    <format type='qcow2'/>
    <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'/>
    <backingStore type='network' index='2'>
      <format type='raw'/>
      <source protocol='nbd'>
        <host name='voodoo7.tlv.redhat.com' port='9000'/>
      </source>
    </backingStore>
  </backingStore>
  <target dev='sda' bus='scsi'/>
  <serial>637a4996-28cd-4587-846b-197d69185316</serial>
  <boot order='1'/>
  <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
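The snapshot step above is performed by RHV/vdsm. As a reference only, here is a minimal libvirt-python sketch of creating such a disk-only external snapshot, assuming the new leaf volume (0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1) was already created by the management system and that libvirt does not keep snapshot metadata; this is an illustration, not the actual vdsm code.

import libvirt

SNAP_XML = """
<domainsnapshot>
  <disks>
    <disk name='sda' snapshot='external' type='block'>
      <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'/>
    </disk>
  </disks>
</domainsnapshot>
"""

def create_external_snapshot(dom):
    # DISK_ONLY: no memory state; REUSE_EXT: the new leaf was pre-created by
    # the management system; NO_METADATA: libvirt does not track the snapshot
    # (RHV tracks the volume chain itself).
    flags = (libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY |
             libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT |
             libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA)
    dom.snapshotCreateXML(SNAP_XML, flags)

if __name__ == "__main__":
    conn = libvirt.open("qemu:///system")
    create_external_snapshot(conn.lookupByName("external-disk-nbd"))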
4. Starting the block copy job

Vdsm calls:

destxml = xmlutils.tostring(drive.getReplicaXML())
flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW |
         libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT |
         libvirt.VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB)
self._dom.blockCopy(drive.name, destxml, flags=flags)

The domain XML shortly after this call:

<disk type='block' device='disk' snapshot='no'>
  <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
  <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore type='block' index='1'>
    <format type='qcow2'/>
    <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'/>
    <backingStore type='network' index='2'>
      <format type='raw'/>
      <source protocol='nbd'>
        <host name='voodoo7.tlv.redhat.com' port='9000'/>
      </source>
    </backingStore>
  </backingStore>
  <mirror type='block' job='copy' ready='yes'>
    <format type='qcow2'/>
    <source dev='/rhev/data-center/mnt/blockSD/1c879721-85e1-4401-bb7c-05f46cf8359f/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
      <seclabel model='dac' relabel='no'/>
    </source>
  </mirror>
  <target dev='sda' bus='scsi'/>
  <serial>637a4996-28cd-4587-846b-197d69185316</serial>
  <boot order='1'/>
  <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

While libvirt is mirroring the active layer, vdsm copies the second layer to the destination disk using qemu-img:

qemu-img convert -f qcow2 -O qcow2 -t none -T none \
    /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc \
    /dev/1c879721-85e1-4401-bb7c-05f46cf8359f/c4261a7e-0af9-4be2-8137-af93a722effc

5. Aborting the job with the PIVOT flag

Vdsm has completed copying the second layer to the destination, and the libvirt block copy job is ready. Vdsm calls:

flags = libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT
self._dom.blockJobAbort(drive.name, flags)

The domain XML changes to:

<disk type='block' device='disk' snapshot='no'>
  <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
  <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore type='block' index='1'>
    <format type='qcow2'/>
    <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'/>
    <backingStore type='network' index='2'>
      <format type='raw'/>
      <source protocol='nbd'>
        <host name='voodoo7.tlv.redhat.com' port='9000'/>
      </source>
    </backingStore>
  </backingStore>
  <mirror type='block' job='copy' ready='pivot'>
    <format type='qcow2'/>
    <source dev='/rhev/data-center/mnt/blockSD/1c879721-85e1-4401-bb7c-05f46cf8359f/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
      <seclabel model='dac' relabel='no'/>
    </source>
  </mirror>
  <target dev='sda' bus='scsi'/>
  <serial>637a4996-28cd-4587-846b-197d69185316</serial>
  <boot order='1'/>
  <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

6. libvirt is stuck in the pivot state.
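For reference, the same flow can be driven directly with libvirt-python, which should reproduce the hang outside of RHV (not tried, per the description). The sketch below assumes the destination volume already exists (hence REUSE_EXT) and detects mirror readiness by polling the live XML for ready='yes' on the <mirror> element; the paths and the external-disk-nbd domain name are taken from this report, and the helper itself is illustrative, not vdsm code.

import time
import libvirt

# Destination for the active-layer copy; vdsm builds this from the replica
# disk (the path below is the mirror destination shown in the XML above).
DEST_XML = """
<disk type='block' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source dev='/rhev/data-center/mnt/blockSD/1c879721-85e1-4401-bb7c-05f46cf8359f/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'/>
</disk>
"""

def copy_and_pivot(dom, disk="sda"):
    copy_flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW |
                  libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT |
                  libvirt.VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB)
    dom.blockCopy(disk, DEST_XML, flags=copy_flags)

    # Wait until the mirror is in sync; libvirt reports ready='yes' on the
    # <mirror> element of the live XML at that point.
    while "ready='yes'" not in dom.XMLDesc(0):
        time.sleep(1)

    # With a healthy chain this returns quickly; with an NBD layer in the
    # source chain this call never returns (the hang described in this bug).
    dom.blockJobAbort(disk, libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)

if __name__ == "__main__":
    conn = libvirt.open("qemu:///system")
    copy_and_pivot(conn.lookupByName("external-disk-nbd"))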
Vdsm logs this every minute:

2019-03-10 03:29:01,833+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: ...
traceback: ...
File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4683, in diskReplicateFinish
  self._dom.blockJobAbort(drive.name, blockJobFlags)
File: "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in f
  ret = attr(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
  ret = f(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
  return func(inst, *args, **kwargs)
File: "/usr/lib64/python2.7/site-packages/libvirt.py", line 792, in blockJobAbort
  ret = libvirtmod.virDomainBlockJobAbort(self._o, disk, flags) (executor:363)

After 3 minutes, the engine tries to abort the job:

2019-03-10 03:31:03,232+0200 DEBUG (jsonrpc/3) [api] START diskReplicateFinish args=(<vdsm.API.VM object at 0x7f85ac046950>, {'device': 'disk', 'poolID': '570f8d15-ece6-4ed9-87a7-84b356edb548', 'volumeID': '0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1', 'domainID': '738bb7b1-87d8-4ee6-8262-f733793f7a02', 'imageID': '637a4996-28cd-4587-846b-197d69185316'}, {'device': 'disk', 'poolID': '570f8d15-ece6-4ed9-87a7-84b356edb548', 'volumeID': '0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1', 'domainID': '738bb7b1-87d8-4ee6-8262-f733793f7a02', 'imageID': '637a4996-28cd-4587-846b-197d69185316'}) kwargs={} (api:122)

This call gets stuck for 30 seconds and then fails with:

2019-03-10 03:31:33,232+0200 ERROR (jsonrpc/3) [api] FINISH diskReplicateFinish error=Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainBlockJobAbort) (api:134)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 573, in diskReplicateFinish
    return self.vm.diskReplicateFinish(srcDisk, dstDisk)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4632, in diskReplicateFinish
    blkJobInfo = self._dom.blockJobInfo(drive.name, 0)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 108, in f
    raise toe
TimeoutError: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainBlockJobAbort)

The only way to recover is to stop the VM.

Version-Release number of selected component (if applicable):

Reproduced on both CentOS 7.6 and Fedora 28 using the virt-preview repo.

CentOS 7.6:
libvirt-python-4.5.0-1.el7.x86_64
libvirt-daemon-4.5.0-10.el7_6.4.x86_64

Fedora 28:
libvirt-python built from source with the fix for bug 1687114
libvirt-daemon-5.1.0-2.fc28.x86_64

How reproducible:
Always.

libvirt logs are disabled by default on RHV, so I don't have a libvirt log. I will provide logs if needed.
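Because blockJobAbort blocks inside libvirt while qemu hangs, and the stuck call also holds the domain job lock (hence the "cannot acquire state change lock" error on the later blockJobInfo call), the pivot cannot be timed out from the calling thread. Below is a hedged sketch of how a caller could at least detect the stuck pivot from a separate thread; the 180-second timeout and the helper name are illustrative and not part of vdsm, and the only recovery observed in this report is stopping the VM.

import threading
import libvirt

def pivot_with_watchdog(dom, disk="sda", timeout=180):
    # Run the pivot in a worker thread so the caller can detect that it is
    # stuck; the blocked libvirt call itself cannot be cancelled from here.
    done = threading.Event()

    def pivot():
        try:
            dom.blockJobAbort(disk, libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
        finally:
            done.set()

    threading.Thread(target=pivot, daemon=True).start()

    if not done.wait(timeout):
        # With this bug the qemu main loop is stuck in nbd_client_close(),
        # so the job never finalizes and the event is never set.
        raise RuntimeError("pivot did not complete within %d seconds" % timeout)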