
Bug 1235004

Summary: blockcommit on gluster can't be restarted after the previous job fails due to network connectivity loss
Product: Red Hat Enterprise Linux 7
Reporter: Peter Krempa <pkrempa>
Component: qemu-kvm-rhev
Assignee: Jeff Cody <jcody>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.2
CC: dyuan, huding, juzhang, knoel, mzhan, pzhang, rbalakri, virt-bugs, virt-maint, xfu, xuzhang
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1227551
Environment:
Last Closed: 2017-01-31 02:00:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1227551

Description Peter Krempa 2015-06-23 17:49:24 UTC
I've added comments in square brackets to explain what's happening on the libvirt<->qemu interface.

+++ This bug was initially created as a clone of Bug #1227551 +++

Description of problem:
If the network connection to the gluster server is broken during a blockcommit, libvirt reports wrong information about the result of the blockcommit, and a subsequent blockcommit then fails.

Version-Release number of selected component (if applicable):
libvirt-1.2.15-2.el7.x86_64
qemu-kvm-rhev-2.3.0-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a healthy guest with its base image on gluster.
# virsh dumpxml gluster | grep disk -A 9
<disk type='network' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source protocol='gluster' name='gluster-vol1/r7q2.img'>
        <host name='$server_IP'/>
      </source>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

[the initial disk is based on gluster]

2. Create snapshots for this guest.
# for i in {1..3}; do virsh snapshot-create-as gluster s$i --disk-only --diskspec vda,file=/tmp/s$i ; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created

[three snapshots on the *local filesystem* are created]


# virsh snapshot-list gluster
 Name                 Creation Time             State
------------------------------------------------------------
 s1                   2015-06-01 16:42:46 +0800 disk-snapshot
 s2                   2015-06-01 16:43:59 +0800 disk-snapshot
 s3                   2015-06-01 16:44:12 +0800 disk-snapshot

3. Do a blockcommit.
In terminal 1 (do the blockcommit):
# virsh blockcommit gluster vda --active --verbose --wait
Block Commit: [30 %]

[this is an active layer block commit]

In terminal 2 (break the network connection to the gluster server):
# iptables -A OUTPUT -d $server_IP -j DROP

4. Check the result a few minutes later.
In terminal 1 (the blockcommit is reported as finished):
# virsh blockcommit gluster vda --active --verbose --wait
Block Commit: [100 %]
Now in synchronized phase

[The above output is due to a bug in virsh. This is being fixed in bug 1227551]

# virsh blockjob gluster vda --info
No current block job for vda

[The above command calls query-block-jobs.]

5. Restore the network connection and do the blockcommit again.
# iptables -D OUTPUT -d $server_IP -j DROP

# virsh blockcommit gluster vda --active --verbose --wait
error: internal error: unable to execute QEMU command 'block-commit': Error (Operation not permitted) flushing drive
[Libvirt doesn't think at this point that there is a pending block job, otherwise the error would be different, so the event propagated successfully and cleared the pending block job flag in libvirt]
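
[To compare libvirt's view with qemu's own, the same query-block-jobs command mentioned in step 4 can be sent to qemu directly through virsh qemu-monitor-command. The following is only a minimal sketch: the helper names qmp_cmd and query_block_jobs are hypothetical, and the domain name is taken from the steps above. Nothing here was run as part of this report.]

```shell
# Hypothetical helper: build the QMP JSON for a command that takes no arguments.
qmp_cmd() {
    printf '{"execute":"%s"}' "$1"
}

# Hypothetical helper: ask qemu which block jobs it currently knows about,
# bypassing libvirt's own job tracking. Needs a running domain and virsh,
# so it is only defined here, not called.
query_block_jobs() {
    virsh qemu-monitor-command "$1" --pretty "$(qmp_cmd query-block-jobs)"
}
```

Usage would be `query_block_jobs gluster`; an empty list in the reply would mean qemu itself has no pending job on any drive.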


Actual results:
In step 4, it reports that the blockcommit finished and is in the mirror phase, but actually it is not.
In step 5, the blockcommit fails again after the network connection is restored.

Expected results:
In step 4, give correct info about the result of the blockcommit.
[Handled by bug 1227551]
In step 5, the blockcommit should succeed after the network connection is restored.
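
[For re-checking step 5, the terminal 2 side of steps 3 to 5 can be scripted roughly as follows. This is a sketch under assumptions, not a tested reproducer: the function names, the DRY_RUN switch, the 300 second wait standing in for "a few minutes", and the placeholder server address are all made up for illustration. The blockcommit from step 3 is expected to be running already in terminal 1.]

```shell
# Domain and disk names come from the report; the rest are placeholders.
DOMAIN=${DOMAIN:-gluster}
DISK=${DISK:-vda}
SERVER_IP=${SERVER_IP:-192.0.2.10}   # placeholder, replace with the gluster server
WAIT_SECS=${WAIT_SECS:-300}          # assumed value for "a few minutes"
DRY_RUN=${DRY_RUN:-1}                # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Terminal 2 side of steps 3-5 (needs root for iptables when not in dry-run).
terminal2_steps() {
    run iptables -A OUTPUT -d "$SERVER_IP" -j DROP   # step 3: cut the connection
    run sleep "$WAIT_SECS"                           # step 4: wait
    run virsh blockjob "$DOMAIN" "$DISK" --info      # step 4: check job state
    run iptables -D OUTPUT -d "$SERVER_IP" -j DROP   # step 5: restore connection
    run virsh blockcommit "$DOMAIN" "$DISK" --active --verbose --wait   # step 5: retry
}
```

Calling `terminal2_steps` with the default DRY_RUN=1 only lists the commands; setting DRY_RUN=0 would execute them.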

Comment 2 Ademar Reis 2015-07-10 20:34:13 UTC
May be related to Bug 1171261