Bug 1630744

Summary: Allow configuring aio=native for gluster storage domains
Product: [oVirt] ovirt-engine Reporter: Sahina Bose <sabose>
Component: BLL.Virt    Assignee: Sahina Bose <sabose>
Status: CLOSED CURRENTRELEASE QA Contact: Shir Fishbain <sfishbai>
Severity: high Docs Contact:
Priority: high    
Version: 4.2.6    CC: bugs, ebenahar, guillaume.pavese, lveyde, mtessun, nichawla, rbarry, sabose, sasundar, seamurph, sfishbai, tnisan, ykaul
Target Milestone: ovirt-4.2.7    Flags: rule-engine: ovirt-4.2+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-4.2.7.3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-02 14:35:52 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Gluster RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1616270    

Description Sahina Bose 2018-09-19 07:38:35 UTC
Description of problem:

For disks that are on a gluster storage domain, there should be an option to start the VM disks with the aio=native option.

As per the results in Bug 1616270, there was a significant gain when using this option for certain workloads. We are in the process of evaluating other workloads, hence the request for a way to configure this.

Version-Release number of selected component (if applicable):
4.2


How reproducible:
NA

Comment 1 Martin Tessun 2018-09-19 11:16:41 UTC
Isn't this first of all a gluster issue that should be fixed in gluster primarily?

Having a workaround in RHV might be an interim solution, but BZ #1616270 should be handled with high prio as well, esp. as it probably also affects "normal" gluster deployments.

Comment 2 Michal Skrivanek 2018-09-19 11:21:13 UTC
feel free to make the change as you see fit for gluster.
It's not configurable in any way right now, but the setting can be flipped easily if it works for all workloads in https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/LibvirtVmXmlBuilder.java#L1766

if it is supposed to be configurable by end user then this becomes a bit more complicated and probably should be tracked as a feature
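
For illustration, a minimal sketch of the kind of storage-type-based decision being discussed; the class, enum and method names here are assumptions, not the actual LibvirtVmXmlBuilder code:

// Illustrative sketch only: mimics the kind of choice made when the disk
// driver element is written, picking the io= attribute from the storage type.
public class DiskIoPolicySketch {

    // Hypothetical stand-in for oVirt's storage type enum.
    enum StorageType { NFS, GLUSTERFS, POSIXFS, ISCSI, FC }

    // File-based domains historically got aio=threads; the change under
    // discussion is to emit aio=native for gluster-backed disks.
    static String ioPolicy(StorageType type) {
        switch (type) {
            case GLUSTERFS:
                return "native";   // the flip discussed in this bug
            case NFS:
            case POSIXFS:
                return "threads";  // unchanged for other file-based domains
            default:
                return "native";   // block domains already use native
        }
    }

    public static void main(String[] args) {
        for (StorageType t : StorageType.values()) {
            System.out.printf("%-9s -> io='%s'%n", t, ioPolicy(t));
        }
    }
}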

Comment 3 Sahina Bose 2018-09-19 13:28:17 UTC
(In reply to Martin Tessun from comment #1)
> Isn't this first of all a gluster issue that should be fixed in gluster
> primarily?
> 
> Having a workaround in RHV might be an interim solution, but BZ #1616270
> should be handled with high prio as well, esp. as it probably also affects
> "normal" gluster deployments.

Bug 1616270 depends on two bugs: a gluster bug which is being fixed, and the change of the aio option (which is not a workaround).

Comment 4 Tal Nisan 2018-09-20 13:17:19 UTC
(In reply to Michal Skrivanek from comment #2)
> feel free to make the change as you see fit for gluster.
> It's not configurable in any way right now, but the setting can be flipped
> easily if it works for all workloads in
> https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/
> vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/
> LibvirtVmXmlBuilder.java#L1766
> 
> if it is supposed to be configurable by end user then this becomes a bit
> more complicated and probably should be tracked as a feature

I agree with Michal here: having native io set for all disks residing on a gluster storage domain is doable quite easily (although the storage domain info is missing in LibvirtVmXmlBuilder, that can be handled).
Letting the user choose whether io will be native or threads is an RFE which will require more work.

Comment 5 Sahina Bose 2018-09-21 06:16:15 UTC
(In reply to Tal Nisan from comment #4)
> (In reply to Michal Skrivanek from comment #2)
> > feel free to make the change as you see fit for gluster.
> > It's not configurable in any way right now, but the setting can be flipped
> > easily if it works for all workloads in
> > https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/
> > vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/
> > LibvirtVmXmlBuilder.java#L1766
> > 
> > if it is supposed to be configurable by end user then this becomes a bit
> > more complicated and probably should be tracked as a feature
> 
> I agree with Michal here: having native io set for all disks residing on a
> gluster storage domain is doable quite easily (although the storage domain
> info is missing in LibvirtVmXmlBuilder, that can be handled).
> Letting the user choose whether io will be native or threads is an RFE
> which will require more work.

Ok, we'll run further tests before making the switch. In the meanwhile, do we know why aio=threads was used for file based storage domains and not aio=native? Any data to suggest threads is better?

Comment 7 Tal Nisan 2018-10-07 13:35:34 UTC
> Ok, we'll run further tests before making the switch. In the meanwhile, do
> we know why aio=threads was used for file based storage domains and not
> aio=native? Any data to suggest threads is better?

I recall Yaniv wrote something about that a while ago, Yaniv can you please advise?

Comment 8 Yaniv Kaul 2018-10-07 17:59:42 UTC
(In reply to Tal Nisan from comment #7)
> > Ok, we'll run further tests before making the switch. In the meanwhile, do
> > we know why aio=threads was used for file based storage domains and not
> > aio=native? Any data to suggest threads is better?
> 
> I recall Yaniv wrote something about that a while ago, Yaniv can you please
> advise?

See https://bugzilla.redhat.com/show_bug.cgi?id=1305886 for a long discussion.
Specifically, https://bugzilla.redhat.com/show_bug.cgi?id=1305886#c4 .

I'm not sure if anything has changed since.

Comment 9 Sahina Bose 2018-10-08 05:19:09 UTC
Thanks, Yaniv and Tal. I've submitted a patch that switches to aio=native for gluster storage domains from cluster version 4.2. The config option also provides the flexibility for users to switch if needed.
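
As a sketch of the shape of such a change (the config key, class and method names below are hypothetical, not the actual patch), the io policy can be gated on both a config flag and the cluster compatibility version:

import java.util.Map;

// Sketch only: gates io=native for gluster-backed disks on a config flag and
// on the cluster compatibility version, as described above. Names are invented.
public class GlusterAioConfigSketch {

    // Stand-in for an engine configuration lookup; the key name is hypothetical.
    static final Map<String, Boolean> CONFIG =
            Map.of("UseNativeIoForGluster", Boolean.TRUE);

    // Returns the io= attribute for a gluster-backed disk: native only when
    // the cluster is at least 4.2 and the option is enabled.
    static String glusterIoPolicy(String clusterVersion) {
        boolean newEnough = compareVersions(clusterVersion, "4.2") >= 0;
        boolean nativeEnabled = CONFIG.getOrDefault("UseNativeIoForGluster", false);
        return (newEnough && nativeEnabled) ? "native" : "threads";
    }

    // Minimal dotted-version comparison, sufficient for this sketch.
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("cluster 4.1 -> io='" + glusterIoPolicy("4.1") + "'");
        System.out.println("cluster 4.2 -> io='" + glusterIoPolicy("4.2") + "'");
    }
}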

Comment 10 Shir Fishbain 2018-10-14 14:18:59 UTC
    
 <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native' iothread='1'/>
      <source file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge11__volume__0/2ad4743c-7ef3-4797-942f-8b6ac75f8d6c/images/7a3b5fae-56ee-4069-b77f-9f46795cc59a/29230098-79b0-4f39-b7f7-97946462f11e'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge11__volume__0/2ad4743c-7ef3-4797-942f-8b6ac75f8d6c/images/7a3b5fae-56ee-4069-b77f-9f46795cc59a/f89d5ab0-b9a8-4643-940f-820151762b18'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <serial>7a3b5fae-56ee-4069-b77f-9f46795cc59a</serial>
      <boot order='1'/>
      <alias name='ua-7a3b5fae-56ee-4069-b77f-9f46795cc59a'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>

Is aio=native supposed to appear that way in the domain XML (io='native')?

Comment 11 Sahina Bose 2018-10-15 05:33:34 UTC
I checked by running
# ps -ef | grep qemu | grep native
and saw:
file=gluster://rhsdev-grafton2.lab.eng.blr.redhat.com:24007/vmstore/288cfe58-913e-48f9-82a3-bb966e00ba15/images/8f9b1c65-e782-4e70-bc9a-9b7595b7fbd3/c0b08836-1b1a-496f-878e-bc4295728827,file.debug=4,format=qcow2,if=none,id=drive-ua-8f9b1c65-e782-4e70-bc9a-9b7595b7fbd3,serial=8f9b1c65-e782-4e70-bc9a-9b7595b7fbd3,cache=none,werror=stop,rerror=stop,aio=native


and the domain xml has:
 <disk type='network' device='disk' snapshot='no'>
        <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native' discard='unmap'/>
        <source protocol='gluster' name='vmstore/288cfe58-913e-48f9-82a3-bb966e00ba15/images/cc684867-a217-42ac-a25b-a671a272c12a/50946f49-bf26-4e9e-9686-2c8aae4c63da' tlsFromConfig='0'>
          <host name='rhsdev-grafton2.lab.eng.blr.redhat.com' port='24007'/>
        </source>


The above was for the network disk type; it looks similar to the output seen for disks on a gluster fuse mount.

Comment 12 Shir Fishbain 2018-10-15 13:40:19 UTC
Verified
file=/rhev/datacenter/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge11__volume__0/2ad4743c-7ef3-4797-942f-8b6ac75f8d6c/images/7a3b5fae-56ee-4069-b77f-9f46795cc59a/29230098-79b0-4f39-b7f7-97946462f11e,format=qcow2,if=none,id=drive-ua-7a3b5fae-56ee-4069-b77f-9f46795cc59a,serial=7a3b5fae-56ee-4069-b77f-9f46795cc59a,werror=stop,rerror=stop,cache=none,aio=native

Comment 13 SATHEESARAN 2018-10-18 18:31:41 UTC
Why is this bug still in MODIFIED state when the fix is already available and verified?

Comment 14 Elad 2018-10-18 18:54:46 UTC
It should have been moved to VERIFIED by a script (since it has 'verified_upstream' in the QA whiteboard).

Comment 15 Sandro Bonazzola 2018-11-02 14:35:52 UTC
This bugzilla is included in the oVirt 4.2.7 release, published on November 2nd 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.