Bug 1856244 - libvirt.libvirtError: internal error: qemu block name 'json: ... when blockCommit() was called for snapshot file in OpenStack
Keywords:
Status: CLOSED DUPLICATE of bug 760547
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.3
Assignee: Virtualization Maintenance
QA Contact: yisun
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-13 07:25 UTC by Masayuki Igawa
Modified: 2023-12-15 18:26 UTC (History)
8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-15 15:36:54 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments: none

Description Masayuki Igawa 2020-07-13 07:25:24 UTC
Description of problem:
When a volume snapshot managed through libvirt in OpenStack is deleted, the snapshot file is committed into its backing file via libvirt's blockCommit(). However, libvirt raises an internal error because the block name reported by qemu does not match the expected path format, as shown below.

~~~
2020-06-30 01:10:55.953 7 DEBUG nova.virt.libvirt.driver will call blockCommit with commit_disk=vdb commit_base=volume-<Vol-UUID> commit_top=volume-<Vol-UUID>.<snapshot-UUID>  _volume_snapshot_delete /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:3010
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver Error occurred during volume_snapshot_delete, sending error status to Cinder.: libvirt.libvirtError: internal error: qemu block name 'json:{"backing": {"driver": "raw", "file": {"driver": "file", "filename": "/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>.<snapshot-UUID>"}}' doesn't match expected '/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>.<snapshot-UUID>'
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>] Traceback (most recent call last):
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 3028, in volume_snapshot_delete
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     snapshot_id, delete_info=delete_info)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 3013, in _volume_snapshot_delete
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     result = dev.commit(commit_base, commit_top, relative=True)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 805, in commit
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     self._disk, base, top, self.COMMIT_DEFAULT_BANDWIDTH, flags=flags)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     rv = execute(f, *args, **kwargs)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     six.reraise(c, e, tb)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     raise value
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     rv = meth(*args, **kwargs)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]   File "/usr/lib64/python3.6/site-packages/libvirt.py", line 728, in blockCommit
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>]     if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
2020-06-30 01:10:55.979 7 ERROR nova.virt.libvirt.driver [instance: <UUID>] libvirt.libvirtError: internal error: qemu block name 'json:{"backing": {"driver": "raw", "file": {"driver": "file", "filename": "/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>.<snapshot-UUID>"}}' doesn't match expected '/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>.<snapshot-UUID>'
~~~
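To illustrate the mismatch in the error above: qemu reports the block name as a "json:" pseudo-protocol string, while the pre-blockdev libvirt code expects a plain path. The following sketch (with hypothetical placeholders UUID / VOL / SNAP standing in for the elided UUIDs) shows that the path libvirt expected is in fact embedded in the JSON string qemu returned:

```python
import json

def qemu_block_name_to_path(block_name: str) -> str:
    """Recover the plain filename from a qemu 'json:' pseudo-protocol
    block name such as the one in the error above."""
    if not block_name.startswith("json:"):
        return block_name  # already a plain path
    spec = json.loads(block_name[len("json:"):])
    # The top-level "file" entry describes the active (top) layer;
    # "backing" holds the lower layers of the chain.
    return spec["file"]["filename"]

# Hypothetical example mirroring the log line, with the elided UUIDs
# replaced by the placeholders UUID / VOL / SNAP:
name = ('json:{"backing": {"driver": "raw", "file": {"driver": "file", '
        '"filename": "/var/lib/nova/mnt/UUID/volume-VOL"}}, '
        '"driver": "qcow2", "file": {"driver": "file", '
        '"filename": "/var/lib/nova/mnt/UUID/volume-VOL.SNAP"}}')
print(qemu_block_name_to_path(name))  # -> /var/lib/nova/mnt/UUID/volume-VOL.SNAP
```

This is only a demonstration of the string formats involved, not how libvirt resolves the mismatch; the actual fix (discussed in the comments below) was libvirt's switch to -blockdev.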


Version-Release number of selected component (if applicable):
 * libvirt-daemon-5.6.0-10

How reproducible:
100%

Steps to Reproduce:
1. Set up OpenStack with an NFS backend for instances
2. Run the OpenStack Tempest test tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_snapshot_create_delete_with_volume_in_use
   -> https://opendev.org/openstack/tempest/src/tag/23.0.0/tempest/api/volume/test_volumes_snapshots.py#L40-L63

Actual results:
An error occurs.
~~~
libvirt.libvirtError: internal error: qemu block name 'json:{"backing": {"driver": "raw", "file": {"driver": "file", "filename": "/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>.<snapshot-UUID>"}}' doesn't match expected '/var/lib/nova/mnt/<UUID>/volume-<Vol-UUID>.<snapshot-UUID>'
~~~

Expected results:
No error occurs.

Additional info:
 * Red Hat OpenStack 16.0 (RHEL8.1)
   - NFS backend for instances
 * qemu-kvm-core-4.1.0-23

Comment 1 Masayuki Igawa 2020-07-13 07:41:11 UTC
I think this issue is the same as BZ#1785939, because the error message and the failing line (libvirt.py line 728) in that bug match this one.
~~~
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5542, in merge
    bandwidth, flags)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 728, in blockCommit
    if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
libvirt.libvirtError: internal error: qemu block name 'json:{"backing": {"backing": {"driver": "qcow2", "file": {"driver": "file", "filename": "/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge7__nfs__0/98c55ff6-4857-4b5e-8e56-31cb3707d5d8/images/32c49b05-90f2-4289-b15e-521e1c81fa0f/709ce457-defb-479e-ba50-43a73dfecab6"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge7__nfs__0/98c55ff6-4857-4b5e-8e56-31cb3707d5d8/images/32c49b05-90f2-4289-b15e-521e1c81fa0f/19fd5cc1-3128-45f5-903e-3f6a9859b98d"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge7__nfs__0/98c55ff6-4857-4b5e-8e56-31cb3707d5d8/images/32c49b05-90f2-4289-b15e-521e1c81fa0f/fd40794a-dbd8-43d0-9357-9c68bb586870"}}' doesn't match expected '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge7__nfs__0/98c55ff6-4857-4b
~~~

So I guess it is already fixed in version 8.2, according to comment BZ#1785939#c33.
Is my understanding correct?

Comment 2 Peter Krempa 2020-07-13 13:12:13 UTC
Yes, this should be fixed by libvirt switching to -blockdev starting from rhel-av-8.2.

Should we move this bug to openstack so it can be tested there? Otherwise I'll close it as a duplicate of the bug enabling -blockdev in libvirt.

Comment 3 Takashi Kajinami 2020-07-13 23:28:33 UTC
Hi Peter,

Thank you for the confirmation.
May I ask a few questions to understand the fix correctly?

(1)
According to your update in the previous bz[1], the issue should be fixed in
qemu-4.2 and libvirt-5.10. Is that correct?

https://bugzilla.redhat.com/show_bug.cgi?id=1785939#c7

(2)
We can certainly move this bug (or create a clone and assign it to RHOSP)
as a test-only change.

However, the problem is that RHOSP16.0, which should be supported until October 27, 2020,
is based on RHEL8.1, which doesn't have the fix yet.
 https://access.redhat.com/support/policy/updates/openstack/platform
Should we make a z-stream request in the previous bz?
Or can we treat this as a fix request for RHEL7.1?
Honestly, I'm concerned that we can't backport the same fix into RHEL7.1 virt,
because the fix sounds like a huge change in architecture.

Comment 4 Peter Krempa 2020-07-14 04:13:28 UTC
(In reply to Takashi Kajinami from comment #3)
> Hi Peter,
> 
> Thank you for the confirmation.
> May I ask a few questions to understand the fix correctly?
> 
> (1)
> According to your update in the previous bz[1], the issue should be fixed in
> qemu-4.2 and libvirt-5.10. Is that correct?

Yes, that is correct. rhel-av-8.2 uses libvirt-6.0.

> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1785939#c7
> 
> (2)
> We can certainly move this bug (or create a clone and assign it to
> RHOSP)
> as a test-only change.
> 
> However, the problem is that RHOSP16.0, which should be supported until
> October 27, 2020,
> is based on RHEL8.1, which doesn't have the fix yet.
>  https://access.redhat.com/support/policy/updates/openstack/platform
> Should we make a z-stream request in the previous bz?
> Or can we treat this as a fix request for RHEL7.1?
> Honestly, I'm concerned that we can't backport the same fix into RHEL7.1 virt,
> because the fix sounds like a huge change in architecture.

No, we can't without a rebase. The changeset that switched over to -blockdev is simply too large. Without it we can't really control the blockjobs any better.
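As a minimal sketch of the version boundary discussed above (the fix is present from libvirt 5.10 onward, per the versions stated in the question and answer): this is an illustrative version comparison only, not an official libvirt capability check, and real code should query libvirt itself rather than sniff version strings.

```python
def libvirt_uses_blockdev(version: str) -> bool:
    """Rough check: libvirt >= 5.10 (paired with qemu >= 4.2) uses
    -blockdev, per the versions stated in this thread. Illustrative
    only; not an official capability probe."""
    major, minor = (int(p) for p in version.split(".")[:2])
    return (major, minor) >= (5, 10)

print(libvirt_uses_blockdev("5.6.0"))  # libvirt in RHOSP 16.0 -> False
print(libvirt_uses_blockdev("6.0.0"))  # rhel-av-8.2 -> True
```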

Comment 5 Peter Krempa 2020-07-14 07:58:00 UTC
I think that you meant RHEL-8.1 in the paragraph above.

Either way, I can only suggest upgrading both libvirt and qemu to rhel-av-8.2. There shouldn't be any semantic changes requiring changes in openstack to be able to use the new libvirt and qemu.

I'm not sure how openstack handles such things, though.

Either way, it's not possible to backport this to rhel-av-8.1. The 'backport' would basically be a rebase to rhel-av-8.2 anyway.

Comment 6 Takashi Kajinami 2020-07-15 07:13:37 UTC
> I think that you meant RHEL-8.1 in the paragraph above.
Yes, sorry for the confusion.

The problem is that, as I mentioned, RHOSP16.0 only supports RHEL8.1,
and RHEL8.2 will be supported starting with the upcoming RHOSP16.1.

This means that the snapshot feature is completely broken in RHOSP16.0 because of this bug in libvirt,
even though RHOSP16.0 will be supported until October, using RHEL 8.1 EUS as its base OS.

So the ideal solution here would be to implement a fix in rhel-av-8.1 (which is not a backport but a kind of
old-version-only fix), but I don't know whether that is acceptable to the RHEL/Virt team.

For your reference (and as a personal note), the following is the list of qemu/libvirt packages
installed in the latest libvirt container image we use in RHOSP16.0:

https://catalog.redhat.com/software/containers/rhosp-rhel8/openstack-nova-libvirt/5de6c2ddbed8bd164a0c1bbf?tag=16.0-108&architecture=amd64&container-tabs=overview

libvirt-bash-completion-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-client-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-config-nwfilter-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-interface-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-network-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-nodedev-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-nwfilter-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-qemu-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-secret-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-core-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-disk-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-gluster-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-iscsi-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-iscsi-direct-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-logical-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-mpath-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-rbd-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-driver-storage-scsi-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-daemon-kvm-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
libvirt-libs-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
qemu-img-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-block-curl-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-block-gluster-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-block-iscsi-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-block-rbd-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-block-ssh-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-common-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64
qemu-kvm-core-4.1.0-23.module+el8.1.1+6238+f5d69f68.3.x86_64

Comment 7 Peter Krempa 2020-07-15 07:45:21 UTC
(In reply to Takashi Kajinami from comment #6)
> > I think that you meant RHEL-8.1 in the paragraph above.
> Yes, sorry for the confusion.
> 
> The problem is that, as I mentioned, RHOSP16.0 only supports RHEL8.1,
> and RHEL8.2 will be supported starting with the upcoming RHOSP16.1.
> 
> This means that the snapshot feature is completely broken in RHOSP16.0
> because of this bug in libvirt,
> even though RHOSP16.0 will be supported until October, using RHEL 8.1 EUS
> as its base OS.
> 
> So the ideal solution here would be to implement a fix in rhel-av-8.1 (which
> is not a backport but a kind of
> old-version-only fix), but I don't know whether that is acceptable to the
> RHEL/Virt team.

So this would be a downstream-only throwaway fix for just rhel-av-8.1, which I don't think will even have another Z-stream release. This really needs to go through PMs first to approve spending time on a throwaway fix.

To be clear, I don't really know the extent of what would need changing. The problem is actually in the cooperation between qemu and libvirt here. Qemu for some reason converts the filenames to JSON strings in this instance, which then throws off libvirt, as the old code wasn't ready for that. A proper fix would also require figuring out when and why that happens in qemu, so that we can be sure.

As said, none of that code is invoked any longer in av-8.2 and upstream.

My suggestion is still just to use a qemu/libvirt combination that uses -blockdev. The idea of libvirt is in fact to shield users from differences in the underlying hypervisor and provide a stable compatibility layer, and it should be utilized as such.

Please raise the issue through appropriate management if the fix really must be based on av-8.1, as I'd consider it a waste of time and effort.

