Bug 1300209 - Cannot take external snapshot any more after active committing & pivoting
Summary: Cannot take external snapshot any more after active committing & pivoting
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 24
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1300165 1302064
Depends On:
Blocks:
Reported: 2016-01-20 09:30 UTC by Yang Yang
Modified: 2016-03-26 18:01 UTC
CC List: 19 users

Fixed In Version: qemu-2.5.0-10.fc24
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-26 18:01:14 UTC
Type: Bug


Attachments
libvirt log (7.96 MB, text/plain)
2016-01-20 09:32 UTC, Yang Yang

Description Yang Yang 2016-01-20 09:30:07 UTC
Description of problem:
After an active block commit and pivot, external snapshots can no longer
be taken. The attempt fails with an error like:
# virsh snapshot-create-as generic hda --disk-only --no-metadata --diskspec hda,file=/tmp/generic.s1
error: internal error: unable to execute QEMU command 'transaction': The feature 'snapshot' is not enabled

Version-Release number of selected component (if applicable):
libvirt-1.3.1-1.fc24_v1.3.1_rc2.x86_64
qemu-kvm-2.5.0-3.fc24.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a transient domain
# virsh list --transient
 Id    Name                           State
----------------------------------------------------
 12    generic                        running

# virsh domblklist generic
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/generic.qcow2

2. Take three external snapshots, writing a file in the guest after each one
# virsh snapshot-create-as generic s1 --disk-only --no-metadata --diskspec hda,file=/var/lib/libvirt/images/generic.s1
Domain snapshot s1 created
# virsh console generic
Write a file in the guest with dd:
# dd if=/dev/urandom of=ha bs=1024 count=102400

# virsh snapshot-create-as generic s2 --disk-only --no-metadata --diskspec hda,file=/var/lib/libvirt/images/generic.s2
Domain snapshot s2 created
Write a file in the guest with dd:
# dd if=/dev/urandom of=haha bs=1024 count=102400

# virsh snapshot-create-as generic s3 --disk-only --no-metadata --diskspec hda,file=/var/lib/libvirt/images/generic.s3
Domain snapshot s3 created
Write a file in the guest with dd:
# dd if=/dev/urandom of=hahaha bs=1024 count=102400

# virsh domblklist generic
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/generic.s3

# qemu-img info /var/lib/libvirt/images/generic.s3 --backing-chain
image: /var/lib/libvirt/images/generic.s3
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 248M
cluster_size: 65536
backing file: /var/lib/libvirt/images/generic.s2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/generic.s2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 56M
cluster_size: 65536
backing file: /var/lib/libvirt/images/generic.s1
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/generic.s1
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 132M
cluster_size: 65536
backing file: /var/lib/libvirt/images/generic.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/generic.qcow2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 2.0G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: true
    refcount bits: 16
    corrupt: false

3. Perform an active commit and pivot
# virsh blockcommit generic hda --active --wait --verbose
Block commit: [100 %]
Now in synchronized phase

# virsh blockjob generic hda
Active Block Commit: [100 %]

# virsh blockjob generic hda --pivot

# virsh domblklist generic
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/generic.qcow2

4. Take one more external snapshot
# virsh snapshot-create-as generic s4 --disk-only --no-metadata --diskspec hda,file=/tmp/generic.s1
error: internal error: unable to execute QEMU command 'transaction': The feature 'snapshot' is not enabled
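
The backing chain printed in step 2 can be reduced to a list of (image, backing file) pairs, which makes it easier to compare the chain before and after the commit in step 3. The following is only a sketch: the helper name backing_chain is hypothetical, the sample text is abbreviated from the output above, and it assumes the human-readable layout shown in this report (other qemu-img versions may print extra detail on the "backing file:" line).

```python
# Sketch only: summarise `qemu-img info --backing-chain` text output.
# `backing_chain` is a hypothetical helper, not part of qemu or libvirt.

# Abbreviated copy of the step 2 output (only the lines the parser reads).
SAMPLE = """\
image: /var/lib/libvirt/images/generic.s3
file format: qcow2
backing file: /var/lib/libvirt/images/generic.s2
backing file format: qcow2

image: /var/lib/libvirt/images/generic.s2
file format: qcow2
backing file: /var/lib/libvirt/images/generic.s1
backing file format: qcow2

image: /var/lib/libvirt/images/generic.s1
file format: qcow2
backing file: /var/lib/libvirt/images/generic.qcow2
backing file format: qcow2

image: /var/lib/libvirt/images/generic.qcow2
file format: qcow2
"""


def backing_chain(info_text):
    """Return [(image, backing_file_or_None), ...], top of the chain first."""
    chain = []
    for line in info_text.splitlines():
        if line.startswith("image: "):
            # Each "image:" line starts a new entry in the chain.
            chain.append([line[len("image: "):], None])
        elif line.startswith("backing file: ") and chain:
            # "backing file format:" does not match this prefix, so only
            # the path line is recorded.
            chain[-1][1] = line[len("backing file: "):]
    return [tuple(entry) for entry in chain]


if __name__ == "__main__":
    for image, backing in backing_chain(SAMPLE):
        print(image, "<-", backing)
```

After a successful commit and pivot, the same helper applied to the output for generic.qcow2 would be expected to return a single entry with no backing file.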

Actual results:
The external snapshot in step 4 fails with the error shown above.

Expected results:
External snapshots can be taken

Additional info:
Cannot be reproduced on RHEL 7.2.
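
For repeated testing, the host-side commands from steps 2 to 4 could be generated by a small script. This is a rough sketch rather than a tested reproducer: the function name repro_commands and its parameters are made up for illustration, the guest-side dd writes are left out, and the script only prints the commands instead of running them.

```python
# Sketch only: build the host-side virsh commands from steps 2-4.
# `repro_commands` and its parameters are hypothetical names.
import shlex


def snapshot_cmd(domain, name, disk, path):
    """External, disk-only snapshot without libvirt metadata."""
    return [
        "virsh", "snapshot-create-as", domain, name,
        "--disk-only", "--no-metadata",
        "--diskspec", f"{disk},file={path}",
    ]


def repro_commands(domain="generic", disk="hda",
                   image_dir="/var/lib/libvirt/images", count=3):
    cmds = []
    # Step 2: take `count` external snapshots (s1, s2, ...).
    for i in range(1, count + 1):
        cmds.append(snapshot_cmd(domain, f"s{i}", disk,
                                 f"{image_dir}/{domain}.s{i}"))
    # Step 3: active commit, then pivot back to the base image.
    cmds.append(["virsh", "blockcommit", domain, disk,
                 "--active", "--wait", "--verbose"])
    cmds.append(["virsh", "blockjob", domain, disk, "--pivot"])
    # Step 4: the snapshot that fails on affected qemu builds.
    cmds.append(snapshot_cmd(domain, f"s{count + 1}", disk,
                             f"/tmp/{domain}.s1"))
    return cmds


if __name__ == "__main__":
    for argv in repro_commands():
        print(shlex.join(argv))
```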

Comment 1 Yang Yang 2016-01-20 09:32:20 UTC
Created attachment 1116560
libvirt log

Comment 2 Cole Robinson 2016-01-20 16:00:10 UTC
Does this work on Fedora 23?
If so, can you check whether F23 qemu with rawhide libvirt works? That will tell us whether the problem is in qemu or libvirt.

Comment 3 Han Han 2016-01-21 01:41:32 UTC
I used qmp-shell to run the snapshot command from step 4:
# qemu/scripts/qmp/qmp-shell /var/lib/libvirt/qemu/domain-nn/monitor.sock
Welcome to the QMP low-level shell!
Connected to QEMU 2.5.0

(QEMU) blockdev-snapshot-sync device=drive-virtio-disk0 snapshot-file=/tmp/kakka format=qcow2
{"error": {"class": "GenericError", "desc": "The feature 'snapshot' is not enabled"}}

From the error message, I think it's a bug in qemu.

Comment 4 Cole Robinson 2016-01-21 17:33:24 UTC
jcody, any thoughts? This seems like a snapshot regression in qemu 2.5.

Comment 5 Jeff Cody 2016-01-22 17:00:35 UTC
Indeed, it does look like a regression.  I've also confirmed that it is present in the current QEMU master upstream.  I'm working to bisect it now.

Here are some easy steps to reproduce with just QEMU (via QMP commands):

qemu-system-x86_64 -enable-kvm -drive file=$1,if=virtio -m 1024 -boot c -qmp stdio

{ "execute": "qmp_capabilities" }

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"tmp.qcow2","format": "qcow2" } }

{ "execute": "block-commit", "arguments": { "device": "virtio0" } }

{ "execute": "block-job-complete", "arguments": { "device": "virtio0" }}

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"tmp2.qcow2","format": "qcow2" } }
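
The same QMP sequence can be generated programmatically, which may help when feeding it to a monitor socket from a test script. The sketch below only builds and prints the JSON commands; the helper names snapshot and repro_sequence are hypothetical, and it does not talk to QEMU itself. One assumption worth stating: an active commit normally has to reach the ready state (QEMU emits a BLOCK_JOB_READY event) before block-job-complete is accepted, so a real client would wait for that event between the third and fourth commands.

```python
# Sketch only: build the QMP commands from the steps above as JSON lines.
# `snapshot` and `repro_sequence` are hypothetical helper names.
import json


def snapshot(device, path):
    """blockdev-snapshot-sync command creating a qcow2 overlay at `path`."""
    return {
        "execute": "blockdev-snapshot-sync",
        "arguments": {
            "device": device,
            "snapshot-file": path,
            "format": "qcow2",
        },
    }


def repro_sequence(device="virtio0"):
    return [
        {"execute": "qmp_capabilities"},
        snapshot(device, "tmp.qcow2"),
        {"execute": "block-commit", "arguments": {"device": device}},
        # A real client would wait for BLOCK_JOB_READY before this one.
        {"execute": "block-job-complete", "arguments": {"device": device}},
        # On affected builds this second snapshot is the one that fails.
        snapshot(device, "tmp2.qcow2"),
    ]


if __name__ == "__main__":
    for command in repro_sequence():
        print(json.dumps(command))
```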

Comment 6 Jeff Cody 2016-01-26 01:20:17 UTC
Git bisect shows this to be a regression caused by:

commit 3f09bfbc7bee812a44838f4c8b254007a9b86cab
Refs: v2.5.0-640-g3db34bf
AuthorDate: Tue Sep 15 11:58:23 2015 +0200
CommitDate: Fri Oct 16 15:34:30 2015 +0200

    block: Add and use bdrv_replace_in_backing_chain()

    This cleans up the mess we left behind in the mirror code after the
    previous patch. Instead of using bdrv_swap(), just change pointers.

    The interface change of the mirror job that callers must consider is
    that after job completion, their local BDS pointers still point to the
    same node now. qemu-img must change its code accordingly (which makes it
    easier to understand); the other callers stays unchanged because after
    completion they don't do anything with the BDS, but just with the job,
    and the job is still owned by the source BDS.

Comment 7 Cole Robinson 2016-01-26 17:16:43 UTC
*** Bug 1302064 has been marked as a duplicate of this bug. ***

Comment 8 Jeff Cody 2016-01-30 14:40:26 UTC
Submitted a patch upstream, and to qemu-stable, with a fix:

https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg06163.html

Comment 9 Jeff Cody 2016-02-01 16:29:21 UTC
*** Bug 1300165 has been marked as a duplicate of this bug. ***

Comment 10 himbeere 2016-02-01 16:43:58 UTC
Applying Jeff's patch solves the problem for me.

thanks.

Comment 11 Jan Kurik 2016-02-24 14:18:31 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information about this action and the reason for it is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

Comment 12 Fedora Update System 2016-03-17 22:31:02 UTC
qemu-2.5.0-10.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-1b264ab4a4

Comment 13 Fedora Update System 2016-03-18 14:55:50 UTC
qemu-2.5.0-10.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-1b264ab4a4

Comment 14 Fedora Update System 2016-03-26 17:59:51 UTC
qemu-2.5.0-10.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

