
Bug 1687176

Summary: virDomainBlockJobAbort with VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT gets stuck when the source chain includes an NBD layer
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.6
Status: CLOSED DEFERRED
Severity: high
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Nir Soffer <nsoffer>
Assignee: Eric Blake <eblake>
QA Contact: aihua liang <aliang>
CC: coli, eblake, jinzhao, juzhang, kwolf, nsoffer, pkrempa, virt-maint
Type: Bug
Last Closed: 2019-07-22 20:31:00 UTC
Attachments:
- qemu log file from centos 7.6
- qemu log file from fedora 28
- libvirtd.log from fedora 28 system
- vdsm log from fedora 28 system
- qemu backtrace stuck vm
- vm xml reproducing this issue
- backtrace after shutting down the vm from oVirt

Description Nir Soffer 2019-03-10 17:26:34 UTC
Description of problem:

When completing live storage migration in RHV, if the source image backing
file is NBD (e.g. nbd://server:port), the block job gets stuck in "pivot"
status.

Here is the flow. It should be possible to reproduce this with plain libvirt
and virsh as well (see the sketch below), but I did not try.
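
A plain-virsh version would presumably look like this (untested sketch;
the destination path is a placeholder):

    $ virsh blockcopy external-disk-nbd sda /dev/dst-vg/dst-lv \
        --shallow --reuse-external --transient-job
    $ virsh blockjob external-disk-nbd sda --info    # wait until the copy is ready
    $ virsh blockjob external-disk-nbd sda --pivot   # hangs with this bug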


1. Preparing a disk

Run qemu-nbd on some host, serving the image:

    $ qemu-nbd \
        --read-only \
        --port 9000 \
        --persistent \
        --export-name '' \
        -f raw \
        /var/tmp/fedora-29.raw

Open the port on the server:

    # firewall-cmd --add-port 9000/tcp

Attach the NBD backing file to a block based disk:

    $ qemu-img rebase -u -b nbd://voodoo7.tlv.redhat.com:9000 -F raw /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc

Verify that the disk sees the NBD backing file:

    $ qemu-img info --backing-chain /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc
    image: /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc
    file format: qcow2
    virtual size: 10G (10737418240 bytes)
    disk size: 0
    cluster_size: 65536
    backing file: nbd://voodoo7.tlv.redhat.com:9000
    backing file format: raw
    Format specific information:
        compat: 1.1
        lazy refcounts: false
        refcount bits: 16
        corrupt: false

    image: nbd://voodoo7.tlv.redhat.com:9000
    file format: raw
    virtual size: 6.0G (6442450944 bytes)
    disk size: unavailable


2. Starting VM

We have one qcow2 disk with an NBD backing file.

    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore type='network' index='1'>
        <format type='raw'/>
        <source protocol='nbd'>
          <host name='voodoo7.tlv.redhat.com' port='9000'/>
        </source>
      </backingStore>
      <target dev='sda' bus='scsi'/>
      <serial>637a4996-28cd-4587-846b-197d69185316</serial>
      <boot order='1'/>
      <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>


3. Creating snapshot

    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore type='block' index='1'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'/>
        <backingStore type='network' index='2'>
          <format type='raw'/>
          <source protocol='nbd'>
            <host name='voodoo7.tlv.redhat.com' port='9000'/>
          </source>
        </backingStore>
      </backingStore>
      <target dev='sda' bus='scsi'/>
      <serial>637a4996-28cd-4587-846b-197d69185316</serial>
      <boot order='1'/>
      <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>


4. Starting block copy job:

Vdsm calls:

    destxml = xmlutils.tostring(drive.getReplicaXML())

    flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW |
             libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT |
             libvirt.VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB)

    self._dom.blockCopy(drive.name, destxml, flags=flags)
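
For reference, the destination XML returned by getReplicaXML() should look
roughly like this (reconstructed from the <mirror> element shown below, not
copied from vdsm):

    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/1c879721-85e1-4401-bb7c-05f46cf8359f/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
        <seclabel model='dac' relabel='no'/>
      </source>
    </disk>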

The domain xml shortly after this call:

    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore type='block' index='1'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'/>
        <backingStore type='network' index='2'>
          <format type='raw'/>
          <source protocol='nbd'>
            <host name='voodoo7.tlv.redhat.com' port='9000'/>
          </source>
        </backingStore>
      </backingStore>
      <mirror type='block' job='copy' ready='yes'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/mnt/blockSD/1c879721-85e1-4401-bb7c-05f46cf8359f/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
          <seclabel model='dac' relabel='no'/>
        </source>
      </mirror>
      <target dev='sda' bus='scsi'/>
      <serial>637a4996-28cd-4587-846b-197d69185316</serial>
      <boot order='1'/>
      <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

While libvirt is mirroring the active layer, vdsm copies the second layer to
the destination disk using qemu-img:

    qemu-img convert -f qcow2 -O qcow2 -t none -T none \
        /dev/738bb7b1-87d8-4ee6-8262-f733793f7a02/c4261a7e-0af9-4be2-8137-af93a722effc \
        /dev/1c879721-85e1-4401-bb7c-05f46cf8359f/c4261a7e-0af9-4be2-8137-af93a722effc


5. Aborting the job with the PIVOT flag

Vdsm has completed copying the second layer to the destination, and the
libvirt block copy job is ready.

Vdsm calls:

    flags = libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT

    self._dom.blockJobAbort(drive.name, flags)

The domain xml changes to:

    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore type='block' index='1'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/mnt/blockSD/738bb7b1-87d8-4ee6-8262-f733793f7a02/images/637a4996-28cd-4587-846b-197d69185316/c4261a7e-0af9-4be2-8137-af93a722effc'/>
        <backingStore type='network' index='2'>
          <format type='raw'/>
          <source protocol='nbd'>
            <host name='voodoo7.tlv.redhat.com' port='9000'/>
          </source>
        </backingStore>
      </backingStore>
      <mirror type='block' job='copy' ready='pivot'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/mnt/blockSD/1c879721-85e1-4401-bb7c-05f46cf8359f/images/637a4996-28cd-4587-846b-197d69185316/0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1'>
          <seclabel model='dac' relabel='no'/>
        </source>
      </mirror>
      <target dev='sda' bus='scsi'/>
      <serial>637a4996-28cd-4587-846b-197d69185316</serial>
      <boot order='1'/>
      <alias name='ua-637a4996-28cd-4587-846b-197d69185316'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
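
For context, this is roughly the pattern around the pivot, as a simplified
libvirt-python sketch (not vdsm's actual code; the domain and target names
are taken from this reproducer):

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('external-disk-nbd')

    # blockJobInfo() returns an empty dict once the job is gone, or a dict
    # with 'type', 'bandwidth', 'cur' and 'end' while the job exists.
    info = dom.blockJobInfo('sda', 0)
    if info and info['cur'] == info['end']:
        # With this bug, the pivot below never returns.
        dom.blockJobAbort('sda', libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)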


6. libvirt is stuck in pivot state.

Vdsm logs this every minute:

2019-03-10 03:29:01,833+0200 WARN  (vdsm.Scheduler) [Executor] Worker blocked: ... traceback:
...
File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4683, in diskReplicateFinish
  self._dom.blockJobAbort(drive.name, blockJobFlags)
File: "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in f
  ret = attr(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
  ret = f(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
  return func(inst, *args, **kwargs)
File: "/usr/lib64/python2.7/site-packages/libvirt.py", line 792, in blockJobAbort
  ret = libvirtmod.virDomainBlockJobAbort(self._o, disk, flags) (executor:363)


After 3 minutes, the engine tries to abort the job:

2019-03-10 03:31:03,232+0200 DEBUG (jsonrpc/3) [api] START diskReplicateFinish args=(<vdsm.API.VM object at 0x7f85ac046950>, {'device': 'disk', 'poolID': '570f8d15-ece6-4ed9-87a7-84b356edb548', 'volumeID': '0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1', 'domainID': '738bb7b1-87d8-4ee6-8262-f733793f7a02', 'imageID': '637a4996-28cd-4587-846b-197d69185316'}, {'device': 'disk', 'poolID': '570f8d15-ece6-4ed9-87a7-84b356edb548', 'volumeID': '0b719960-cbcc-42c3-8d1b-7ea1cdaf0db1', 'domainID': '738bb7b1-87d8-4ee6-8262-f733793f7a02', 'imageID': '637a4996-28cd-4587-846b-197d69185316'}) kwargs={} (api:122)

This call gets stuck for 30 seconds and then fails with:

2019-03-10 03:31:33,232+0200 ERROR (jsonrpc/3) [api] FINISH diskReplicateFinish error=Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainBlockJobAbort) (api:134)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 573, in diskReplicateFinish
    return self.vm.diskReplicateFinish(srcDisk, dstDisk)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4632, in diskReplicateFinish
    blkJobInfo = self._dom.blockJobInfo(drive.name, 0)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 108, in f
    raise toe
TimeoutError: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainBlockJobAbort)


The only way to recover is to stop the VM.
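
In practice that means forcing the VM off, e.g.:

    $ virsh destroy external-disk-nbd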


Version-Release number of selected component (if applicable):
Reproduced on both CentOS 7.6 and Fedora 28 using the virt-preview repo.

CentOS 7.6:
libvirt-python-4.5.0-1.el7.x86_64
libvirt-daemon-4.5.0-10.el7_6.4.x86_64

Fedora 28:
libvirt-python built from source with fix for bug 1687114
libvirt-daemon-5.1.0-2.fc28.x86_64

How reproducible:
Always.

libvirt logs are disabled by default on RHV, so I don't have a libvirt log.

I will provide logs if needed.
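
For reference, debug logging can be enabled in /etc/libvirt/libvirtd.conf
with something like this (the filter set commonly recommended by libvirt,
not an RHV default):

    log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"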

Comment 2 Nir Soffer 2019-03-10 17:42:56 UTC
RHV does not use disks with an NBD layer yet, but we plan to use them soon
for migrating VMs from external systems to RHV.

Comment 3 Peter Krempa 2019-03-11 07:30:49 UTC
What qemu version are you using? Is there anything of interest in the VM log file?

Libvirt does not do anything differently when an NBD disk is in the chain, so this is most probably a qemu problem. Please add the qemu version so the bug can be moved appropriately.

Comment 4 Nir Soffer 2019-03-11 13:00:55 UTC
(In reply to Peter Krempa from comment #3)
> What qemu version are you using?

Tested on:

CentOS 7.6:
qemu-kvm-ev-2.12.0-18.el7_6.3.1.x86_64

Fedora 28:
qemu-kvm-3.1.0-4.fc28.1.x86_64

Comment 5 Nir Soffer 2019-03-11 13:44:01 UTC
Created attachment 1542904 [details]
qemu log file from centos 7.6

Comment 6 Nir Soffer 2019-03-11 13:44:40 UTC
Created attachment 1542905 [details]
qemu log file from fedora 28

Comment 7 Nir Soffer 2019-03-11 15:39:51 UTC
Created attachment 1542922 [details]
libvirtd.log from fedora 28 system

The VM is called external-disk-nbd, started at:
2019-03-11 15:18:38.142+0000

Comment 8 Nir Soffer 2019-03-11 15:41:10 UTC
Created attachment 1542924 [details]
vdsm log from fedora 28 system

Comment 9 Nir Soffer 2019-03-11 15:47:07 UTC
Created attachment 1542938 [details]
qemu backtrace stuck vm

Comment 10 Nir Soffer 2019-03-11 15:55:32 UTC
Looking at attachment 1542938 [details], it looks like the qemu main loop
(monitor) thread is stuck trying to close the NBD connection.

Thread 1 (Thread 0x7fb775a38f40 (LWP 20610)):
#0  0x00007fb76eb02526 in ppoll () from target:/lib64/libc.so.6
#1  0x000055bb79596189 in qemu_poll_ns ()
#2  0x000055bb79598234 in aio_poll ()
#3  0x000055bb79520d7a in nbd_client_close ()
#4  0x000055bb7951e64e in nbd_close ()
#5  0x000055bb794c3132 in bdrv_unref ()
#6  0x000055bb794c315f in bdrv_unref ()
#7  0x000055bb794c3075 in bdrv_set_backing_hd ()
#8  0x000055bb794c314b in bdrv_unref ()
#9  0x000055bb794c3075 in bdrv_set_backing_hd ()
#10 0x000055bb794c314b in bdrv_unref ()
#11 0x000055bb7950fd05 in mirror_exit_common ()
#12 0x000055bb794c88e9 in job_do_finalize ()
#13 0x000055bb794c8b94 in job_exit ()
#14 0x000055bb79594bf6 in aio_bh_poll ()
#15 0x000055bb79597f24 in aio_dispatch ()
#16 0x000055bb79594ad2 in aio_ctx_dispatch ()
#17 0x00007fb774ccb91d in g_main_context_dispatch () from target:/lib64/libglib-2.0.so.0
#18 0x000055bb795971a8 in main_loop_wait ()
#19 0x000055bb792f7fa9 in main_loop ()
#20 0x000055bb7919a8cc in main ()


The NBD server is qemu-nbd running on a CentOS 7.6 host:
qemu-img-ev-2.12.0-18.el7_6.3.1.x86_64

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ff1f138b2cf in ppoll () from /lib64/libc.so.6

Thread 1 (Thread 0x7ff208457b80 (LWP 21013)):
#0  0x00007ff1f138b2cf in ppoll () from /lib64/libc.so.6
#1  0x0000560934b99e7b in qemu_poll_ns ()
#2  0x0000560934b9ad3e in main_loop_wait ()
#3  0x0000560934ad78bc in main ()

Eric, this seems like a qemu issue.

Comment 11 Eric Blake 2019-03-11 16:14:29 UTC
Kevin, is this related to any of the bdrv_drain_all bugs you've been trying to plug?

Comment 12 Nir Soffer 2019-03-11 16:45:31 UTC
Created attachment 1542940 [details]
vm xml reproducing this issue

Comment 13 Nir Soffer 2019-03-11 16:47:24 UTC
Created attachment 1542953 [details]
backtrace after shutting down the vm from oVirt

Comment 14 Nir Soffer 2019-03-11 16:50:03 UTC
I can reproduce the same issue by:
1. start a VM with an NBD layer (see attachment 1542940 [details])
2. shut down the VM in oVirt
3. the VM gets stuck, blocked on:

Thread 1 (Thread 0x7f8a341ebf40 (LWP 9729)):
#0  0x00007f8a2d2b5526 in ppoll () from target:/lib64/libc.so.6
#1  0x000055b79463f189 in qemu_poll_ns ()
#2  0x000055b794641234 in aio_poll ()
#3  0x000055b7945c9d7a in nbd_client_close ()
#4  0x000055b7945c764e in nbd_close ()
#5  0x000055b79456c132 in bdrv_unref ()
#6  0x000055b79456c15f in bdrv_unref ()
#7  0x000055b79456c075 in bdrv_set_backing_hd ()
#8  0x000055b79456c14b in bdrv_unref ()
#9  0x000055b7945b0605 in blk_remove_bs ()
#10 0x000055b7945b0664 in blk_remove_all_bs ()
#11 0x000055b7945691a3 in bdrv_close_all ()
#12 0x000055b7942438e0 in main ()
 
See attachment 1542953 [details] for full backtrace.

Comment 15 Kevin Wolf 2019-03-11 16:56:51 UTC
(In reply to Eric Blake from comment #11)
> Kevin, is this related to any of the bdrv_drain_all bugs you've been trying
> to plug?

I would hope so, but the backtrace is just what any hang typically looks like, so I can't tell for sure. I guess it's a matter of testing against upstream.

I suspect that maybe even commit 4720cbee alone, which just adds an aio_wait_kick(), is enough.

Comment 16 Nir Soffer 2019-03-11 19:16:47 UTC
Reproduced with qemu master:

commit 2e35c7738e16452a95aec1fbef5e8b6f38a2d647 (HEAD -> master, origin/master, origin/HEAD)
Merge: 79d8b1dc5b ff243cff11
Author: Peter Maydell <peter.maydell>
Date:   Mon Mar 11 13:45:37 2019 +0000

Same backtrace when shutting down a vm.

Comment 17 Nir Soffer 2019-03-11 20:36:46 UTC
When qemu is built with --enable-debug, I cannot reproduce.
I can start and stop the same VM from attachment 1542940 [details] without
any issue.

So it's either the debug build, or I was not running the master qemu when
I reproduced this before.

When built with --enable-debug, libvirt failed to start qemu because of
bad selinux labels on /usr/local/bin/qemu-system-*. So I suspect that
libvirt was using /usr/bin/qemu-system-*.
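
A workaround is presumably relabeling the locally built binaries (assuming
the stock qemu_exec_t type applies):

    # chcon -t qemu_exec_t /usr/local/bin/qemu-system-*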

Comment 18 aihua liang 2019-03-12 05:58:20 UTC
Same issue as comment 4 of bug 1655367:
   https://bugzilla.redhat.com/show_bug.cgi?id=1655367#c4

It's a block-mirror hang issue.