Bug 1217185 - Sometimes libvirt fails to update domain block file after blockcommit.
Summary: Sometimes libvirt fails to update domain block file after blockcommit.
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: x86_64
OS: Other
unspecified
medium
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-04-29 17:45 UTC by Matthew Schumacher
Modified: 2016-04-10 21:12 UTC (History)
7 users (show)

Fixed In Version:
Clone Of:
Environment:
Slackware 14.1 x86_64
Last Closed: 2016-04-10 21:12:21 UTC
Embargoed:


Attachments (Terms of Use)

Description Matthew Schumacher 2015-04-29 17:45:24 UTC
I just stumbled on another bug while snapshotting and think it's related to 1210903 and 1197592 as it seems like some sort of race condition because it depends on what logging is in place and doesn't happen every time.

Here are the details:

I wrote this test script to snapshot and commit over and over:

#!/bin/sh

while [ 1 ]; do

  echo "Starting snapshot test `date`"
  virsh snapshot-create-as test 20150429 20150429-backup --disk-only --atomic
  virsh domblklist test      
  virsh blockcommit test vda --active --pivot --verbose
  virsh snapshot-delete test 20150429 --metadata
  virsh domblklist test
  rm /glustervol1/vm/test/test.20150429
  echo "Ending snapshot test `date`"
  echo
  echo

  sleep 2

done

If I run libvirtd in the foreground with debug set to 1 I can't get it to fail, it does what it's supposed to do, snapshot and commit over and over.

If I run libvirtd in the foreground with debug set to 3, then I will always eventually get this:

Starting snapshot test Wed Apr 29 09:34:34 AKDT 2015
Domain snapshot 20150429 created
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

Block Commit: [100 %]
Successfully pivoted
Domain snapshot 20150429 deleted

Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

Ending snapshot test Wed Apr 29 09:34:35 AKDT 2015


Starting snapshot test Wed Apr 29 09:34:37 AKDT 2015
error: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name

Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

error: internal error: qemu block name '/glustervol1/vm/test/test.qcow2' doesn't match expected '/glustervol1/vm/test/test.20150429'

error: Domain snapshot not found: no domain snapshot with matching name '20150429'

Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

rm: can't remove '/glustervol1/vm/test/test.20150429': No such file or directory
Ending snapshot test Wed Apr 29 09:34:37 AKDT 2015

At this point libvirt is confused about which file is the backing store because the first run did pivot after blockcommit, but didn't update the block file.  From the logs:

2015-04-29 17:33:41.052+0000: 25192: warning : qemuDomainObjTaint:1972 : Domain id=2 name='test' uuid=4b9cc25b-68d1-4ce8-8a65-2a378e255e36 is tainted: high-privileges
2015-04-29 17:34:37.322+0000: 25191: error : virDomainSnapshotAlignDisks:609 : unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
2015-04-29 17:34:37.352+0000: 25194: error : qemuMonitorJSONDiskNameLookup:3977 : internal error: unable to find backing name for device drive-virtio-disk0
2015-04-29 17:34:37.354+0000: 25194: error : qemuMonitorJSONDiskNameLookupOne:3914 : internal error: qemu block name '/glustervol1/vm/test/test.qcow2' doesn't match expected '/glustervol1/vm/test/test.20150429'


So libvirt insists that the block file is: 

root@wasvirt2:/glustervol1/vm/waspbx# virsh domblklist test
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

But that file isn't in use and isn't what qemu is using:

root@wasvirt2:/glustervol1/vm/waspbx# lsof | grep test
25300   /usr/bin/qemu-system-x86_64     /var/log/libvirt/qemu/test.log
25300   /usr/bin/qemu-system-x86_64     /var/log/libvirt/qemu/test.log
25300   /usr/bin/qemu-system-x86_64     /glustervol1/vm/test/test.qcow2

The only way to straighten this out is to destroy and start the domain.

Comment 1 Matthew Schumacher 2015-04-29 19:45:47 UTC
I can now confirm that I can replicate this issue for both qemu 2.3.0 and qemu 2.2.1, but qemu 2.1.3 works just fine which further indicates that this may be related to 1210903 and 1197592.

Comment 2 Peter Krempa 2015-04-30 07:27:20 UTC
Which version of libvirt is your distro using?

Comment 3 Matthew Schumacher 2015-04-30 17:11:19 UTC
libvirt-1.2.14-x86_64-1_SBo.tgz

From:

http://slackbuilds.org/repository/14.1/libraries/libvirt/

Comment 4 Matthew Schumacher 2015-05-11 20:52:38 UTC
Just tried this with libvirt-1.2.15 built with the same slackbuild script from the link above and still have the same issue with qemu-2.2.1 or qemu-2.3.0.

Comment 5 Cole Robinson 2016-04-10 21:12:21 UTC
Those two bugs you referenced in the first comment have patches in libvirt.git, at least version 1.3.2, so I'm guessing this is fixed upstream too. Closing, but if you reproduce with more recent libvirt+qemu, please reopen


Note You need to log in before you can comment on or make changes to this bug.