
Bug 1631247

Summary: Issue enabling cluster.use-compound-fops with libgfapi application running
Product: [Community] GlusterFS
Reporter: Paolo Margara <paolo.margara>
Component: libgfapi
Assignee: bugs <bugs>
Status: CLOSED NEXTRELEASE
QA Contact: bugs <bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: mainline
CC: atumball, bugs, pasik
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-7.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-14 10:36:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Paolo Margara 2018-09-20 09:58:46 UTC
Description of problem:

I'm running oVirt with libgfapi enabled on gluster 3.12.13. When I set "cluster.use-compound-fops" to "on", every VM is paused due to a storage I/O error, while the file system continues to be accessible through the FUSE client (only the libgfapi application [qemu] stops working).


Version-Release number of selected component (if applicable): 
* gluster 3.12.13
* qemu 2.10.0-21.el7_5.4.1
* ovirt 4.2.6

How reproducible:
On an oVirt 4.2.6 hyperconverged (hc) installation configured with libgfapi enabled and gluster 3.12.13, run:

gluster volume set $vm_images_volume_name cluster.use-compound-fops on

When this command is executed, every VM is paused due to a storage I/O error, while the file system continues to be accessible through the FUSE client (only the libgfapi application stops working). In the qemu log file I can see these gluster-related messages:

2018-09-14T11:49:37.020942Z qemu-kvm: terminating on signal 15 from pid 1513 (/usr/sbin/libvirtd)
2018-09-14T11:49:42.766431Z qemu-kvm: Failed to flush the L2 table cache: Input/output error
2018-09-14T11:49:44.766853Z qemu-kvm: Failed to flush the refcount block cache: Input/output error
[2018-09-14 11:49:44.869112] E [MSGID: 108006] [afr-common.c:5118:__afr_handle_child_down_event] 0-vm-images-repo-demo-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869284] E [MSGID: 108006] [afr-common.c:5118:__afr_handle_child_down_event] 0-vm-images-repo-demo-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869515] E [MSGID: 108006] [afr-common.c:5118:__afr_handle_child_down_event] 0-vm-images-repo-demo-replicate-2: All subvolumes are down. Going offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869639] E [MSGID: 108006] [afr-common.c:5118:__afr_handle_child_down_event] 0-vm-images-repo-demo-replicate-3: All subvolumes are down. Going offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869823] E [MSGID: 108006] [afr-common.c:5118:__afr_handle_child_down_event] 0-vm-images-repo-demo-replicate-4: All subvolumes are down. Going offline until atleast one of them comes back up.
2018-09-14 11:49:45.827+0000: shutting down, reason=destroyed
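
To summarize which replicate subvolumes went offline, the MSGID 108006 entries in a log like the one above can be extracted with a short script. This is only a sketch: the function name `find_offline_replicas` and the embedded sample are illustrative, and the pattern assumes the message layout shown in this report (it tolerates entries wrapped across several lines).

```python
import re

# Matches AFR "All subvolumes are down" errors (MSGID 108006).
# \s+ is used between fields so that wrapped log entries still match.
PATTERN = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"E\s+\[MSGID:\s*108006\]\s+"
    r"\[[^\]]+\]\s+"
    r"(?P<xlator>\S+):\s+All\s+subvolumes\s+are\s+down"
)


def find_offline_replicas(log_text):
    """Return (timestamp, translator name) pairs for each MSGID 108006 entry."""
    return [(m.group("ts"), m.group("xlator")) for m in PATTERN.finditer(log_text)]


# Two entries copied from the qemu log in this report, wrapped as they were pasted.
SAMPLE = """\
2018-09-14T11:49:44.766853Z qemu-kvm: Failed to flush the refcount block
cache: Input/output error
[2018-09-14 11:49:44.869112] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-1: All subvolumes are down. Going
offline until atleast one of them comes back up.
[2018-09-14 11:49:44.869284] E [MSGID: 108006]
[afr-common.c:5118:__afr_handle_child_down_event]
0-vm-images-repo-demo-replicate-0: All subvolumes are down. Going
offline until atleast one of them comes back up.
"""

if __name__ == "__main__":
    for timestamp, xlator in find_offline_replicas(SAMPLE):
        print(timestamp, xlator)
```

Run against the full qemu log, this should list one entry per replicate subvolume that reported all of its bricks as down.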


If I set "cluster.use-compound-fops" to "off", everything starts working correctly again.
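
The workaround of switching the option back off can be scripted. The following is a minimal sketch that assumes the `gluster` CLI is on the PATH of the host where it runs; the helper names `build_set_command` and `disable_compound_fops` and the volume name in the usage line are hypothetical. It wraps the same `gluster volume set` command used in this report.

```python
import subprocess

OPTION = "cluster.use-compound-fops"


def build_set_command(volume, value):
    """Build the argv for: gluster volume set <volume> cluster.use-compound-fops <value>."""
    if value not in ("on", "off"):
        raise ValueError("value must be 'on' or 'off'")
    return ["gluster", "volume", "set", volume, OPTION, value]


def disable_compound_fops(volume, dry_run=True):
    """Turn the option off; with dry_run=True only return the command without running it."""
    cmd = build_set_command(volume, "off")
    if not dry_run:
        # Raises CalledProcessError if the gluster CLI exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # "vm-images-repo-demo" is a placeholder volume name for illustration.
    print(" ".join(disable_compound_fops("vm-images-repo-demo")))
```

The dry-run default is a deliberate choice, so the command can be reviewed before it is applied to a volume that hosts running VMs.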




Steps to Reproduce:
1. Set "cluster.use-compound-fops" to "on" on a gluster volume that hosts VM images used by qemu with libgfapi.


Actual results:
If I set "cluster.use-compound-fops" to "on", every VM run by qemu with libgfapi reports that all subvolumes are down.


Expected results:
If "cluster.use-compound-fops" is set to "on", every VM should continue to work correctly.


Additional info:
Let me know if you need more information or log files to find the source of the problem.

Comment 1 Shyamsundar 2018-10-23 14:55:12 UTC
Release 3.12 has reached EOL and this bug was still in the NEW state, hence moving the version to mainline to triage it and take appropriate action.

Comment 2 Amar Tumballi 2019-06-14 10:36:37 UTC
We have now removed all references to compound-fops.