Bug 1300209 - Cannot take external snapshot any more after active committing & pivoting
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: qemu
Version: 24
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Assigned To: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1300165, 1302064
Reported: 2016-01-20 04:30 EST by yangyang
Modified: 2016-03-26 14:01 EDT
Fixed In Version: qemu-2.5.0-10.fc24
Doc Type: Bug Fix
Last Closed: 2016-03-26 14:01:14 EDT
Type: Bug

Attachments
libvirt log (7.96 MB, text/plain)
2016-01-20 04:32 EST, yangyang

Description yangyang 2016-01-20 04:30:07 EST
Description of problem:
After an active block commit and pivot, external snapshots can no
longer be taken. The error looks like this:
# virsh snapshot-create-as generic hda --disk-only --no-metadata --diskspec hda,file=/tmp/generic.s1
error: internal error: unable to execute QEMU command 'transaction': The feature 'snapshot' is not enabled

Version-Release number of selected component (if applicable):
libvirt-1.3.1-1.fc24_v1.3.1_rc2.x86_64
qemu-kvm-2.5.0-3.fc24.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare a transient domain
# virsh list --transient
 Id    Name                           State
----------------------------------------------------
 12    generic                        running

# virsh domblklist generic
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/generic.qcow2

2. take 3 external snapshots
# virsh snapshot-create-as generic s1 --disk-only --no-metadata --diskspec hda,file=/var/lib/libvirt/images/generic.s1
Domain snapshot s1 created
# virsh console generic
In the guest, write a file with dd:
# dd if=/dev/urandom of=ha bs=1024 count=102400

# virsh snapshot-create-as generic s2 --disk-only --no-metadata --diskspec hda,file=/var/lib/libvirt/images/generic.s2
Domain snapshot s2 created
In the guest, write a file with dd:
# dd if=/dev/urandom of=haha bs=1024 count=102400

# virsh snapshot-create-as generic s3 --disk-only --no-metadata --diskspec hda,file=/var/lib/libvirt/images/generic.s3
Domain snapshot s3 created
In the guest, write a file with dd:
# dd if=/dev/urandom of=hahaha bs=1024 count=102400

# virsh domblklist generic
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/generic.s3

# qemu-img info /var/lib/libvirt/images/generic.s3 --backing-chain
image: /var/lib/libvirt/images/generic.s3
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 248M
cluster_size: 65536
backing file: /var/lib/libvirt/images/generic.s2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/generic.s2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 56M
cluster_size: 65536
backing file: /var/lib/libvirt/images/generic.s1
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/generic.s1
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 132M
cluster_size: 65536
backing file: /var/lib/libvirt/images/generic.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/generic.qcow2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 2.0G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: true
    refcount bits: 16
    corrupt: false

3. active commit & pivot
# virsh blockcommit generic hda --active --wait --verbose
Block commit: [100 %]
Now in synchronized phase

# virsh blockjob generic hda
Active Block Commit: [100 %]

# virsh blockjob generic hda --pivot

# virsh domblklist generic
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/generic.qcow2

4. take 1 external snapshot
# virsh snapshot-create-as generic s4 --disk-only --no-metadata --diskspec hda,file=/tmp/generic.s1
error: internal error: unable to execute QEMU command 'transaction': The feature 'snapshot' is not enabled

Actual results:
Snapshot creation fails with the error shown in step 4.

Expected results:
External snapshots can be taken

Additional info:
Cannot reproduce it on RHEL7.2
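
For repeated testing, steps 2 to 4 above can be sketched as a small shell script. This is only a sketch under stated assumptions: the names DOMAIN, TARGET, IMGDIR and the three functions are illustrative and not part of the report, the dd writes inside the guest are skipped, and the fourth snapshot is placed in IMGDIR rather than /tmp. It uses only the virsh subcommands already shown in the steps.

```shell
#!/bin/sh
# Sketch of steps 2-4 above. DOMAIN, TARGET and IMGDIR are assumptions taken
# from the example in this report; adjust them for your host.
DOMAIN=${DOMAIN:-generic}
TARGET=${TARGET:-hda}
IMGDIR=${IMGDIR:-/var/lib/libvirt/images}

# Print the current source path of disk target $1, parsed from the
# two-column table that `virsh domblklist` prints. Paths containing
# spaces are not handled.
disk_source() {
    virsh domblklist "$DOMAIN" | awk -v t="$1" 'NR > 2 && $1 == t { print $2 }'
}

# Take one disk-only external snapshot named $1.
take_snapshot() {
    virsh snapshot-create-as "$DOMAIN" "$1" --disk-only --no-metadata \
        --diskspec "$TARGET,file=$IMGDIR/$DOMAIN.$1"
}

# Build the chain, commit it back into the base image, pivot, then try
# one more snapshot. On an affected qemu the last call is the one that fails.
reproduce() {
    for name in s1 s2 s3; do
        take_snapshot "$name" || return 1
    done
    virsh blockcommit "$DOMAIN" "$TARGET" --active --wait --verbose || return 1
    virsh blockjob "$DOMAIN" "$TARGET" --pivot || return 1
    echo "active layer after pivot: $(disk_source "$TARGET")"
    take_snapshot s4
}
```

After sourcing the file, calling `reproduce` against a running domain runs the whole sequence; a non-zero exit status from the final `take_snapshot s4` corresponds to the failure in step 4.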
Comment 1 yangyang 2016-01-20 04:32 EST
Created attachment 1116560 [details]
libvirt log
Comment 2 Cole Robinson 2016-01-20 11:00:10 EST
Does this work on fedora 23?
If so, can you see if f23 qemu + rawhide libvirt works? That will tell us if it's qemu or libvirt
Comment 3 Han Han 2016-01-20 20:41:32 EST
I used qmp-shell to run the snapshot command from step 4:
# qemu/scripts/qmp/qmp-shell /var/lib/libvirt/qemu/domain-nn/monitor.sock
Welcome to the QMP low-level shell!
Connected to QEMU 2.5.0

(QEMU) blockdev-snapshot-sync device=drive-virtio-disk0 snapshot-file=/tmp/kakka format=qcow2
{"error": {"class": "GenericError", "desc": "The feature 'snapshot' is not enabled"}}

From the error message, I think it's a bug in qemu.
Comment 4 Cole Robinson 2016-01-21 12:33:24 EST
jcody, any thoughts? seems like a snapshot regression in qemu 2.5
Comment 5 Jeff Cody 2016-01-22 12:00:35 EST
Indeed, it does look like a regression.  I've also confirmed that it is present in the current QEMU master upstream.  I'm working to bisect it now.

Here are some easy steps to reproduce with just QEMU (via QMP commands):

qemu-system-x86_64 -enable-kvm -drive file=$1,if=virtio -m 1024 -boot c -qmp stdio

{ "execute": "qmp_capabilities" }

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"tmp.qcow2","format": "qcow2" } }

{ "execute": "block-commit", "arguments": { "device": "virtio0" } }

{ "execute": "block-job-complete", "arguments": { "device": "virtio0" }}

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"tmp2.qcow2","format": "qcow2" } }
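
To try the same sequence against a different device name, the commands above can be generated rather than retyped. The following is a minimal sketch; the function names are made up for illustration, and the overlay file names tmp.qcow2 and tmp2.qcow2 are simply the ones used above.

```shell
#!/bin/sh
# Sketch: print the QMP commands from this comment for a given device.
# All function names here are illustrative only.

# JSON for one blockdev-snapshot-sync call: $1 = device, $2 = overlay file.
qmp_snapshot() {
    printf '{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "%s", "snapshot-file": "%s", "format": "qcow2" } }\n' "$1" "$2"
}

# JSON for a command that takes only a device argument:
# $1 = command name, $2 = device.
qmp_device_cmd() {
    printf '{ "execute": "%s", "arguments": { "device": "%s" } }\n' "$1" "$2"
}

# The full sequence: snapshot, active commit, complete (pivot), snapshot again.
qmp_repro_sequence() {
    dev=$1
    echo '{ "execute": "qmp_capabilities" }'
    qmp_snapshot "$dev" tmp.qcow2
    qmp_device_cmd block-commit "$dev"
    qmp_device_cmd block-job-complete "$dev"
    qmp_snapshot "$dev" tmp2.qcow2
}
```

Running `qmp_repro_sequence virtio0` prints the five lines shown above. They should be entered into the `-qmp stdio` session one at a time rather than piped in together, because block-job-complete is only accepted once the commit job has emitted BLOCK_JOB_READY.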
Comment 6 Jeff Cody 2016-01-25 20:20:17 EST
Git bisect shows this to be a regression caused by:

commit 3f09bfbc7bee812a44838f4c8b254007a9b86cab
Refs: v2.5.0-640-g3db34bf
AuthorDate: Tue Sep 15 11:58:23 2015 +0200
CommitDate: Fri Oct 16 15:34:30 2015 +0200

    block: Add and use bdrv_replace_in_backing_chain()

    This cleans up the mess we left behind in the mirror code after the
    previous patch. Instead of using bdrv_swap(), just change pointers.

    The interface change of the mirror job that callers must consider is
    that after job completion, their local BDS pointers still point to the
    same node now. qemu-img must change its code accordingly (which makes it
    easier to understand); the other callers stays unchanged because after
    completion they don't do anything with the BDS, but just with the job,
    and the job is still owned by the source BDS.
Comment 7 Cole Robinson 2016-01-26 12:16:43 EST
*** Bug 1302064 has been marked as a duplicate of this bug. ***
Comment 8 Jeff Cody 2016-01-30 09:40:26 EST
Submitted a patch upstream, and to qemu-stable, with a fix:

https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg06163.html
Comment 9 Jeff Cody 2016-02-01 11:29:21 EST
*** Bug 1300165 has been marked as a duplicate of this bug. ***
Comment 10 himbeere 2016-02-01 11:43:58 EST
Applying Jeff's patch solves the problem for me.

thanks.
Comment 11 Jan Kurik 2016-02-24 09:18:31 EST
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase
Comment 12 Fedora Update System 2016-03-17 18:31:02 EDT
qemu-2.5.0-10.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-1b264ab4a4
Comment 13 Fedora Update System 2016-03-18 10:55:50 EDT
qemu-2.5.0-10.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-1b264ab4a4
Comment 14 Fedora Update System 2016-03-26 13:59:51 EDT
qemu-2.5.0-10.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
