Bug 1198810

Summary: VM creation from template fails when multiple builds are running concurrently
Product: [Community] GlusterFS
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Reporter: Paul Cuzner <pcuzner>
Assignee: Ravishankar N <ravishankar>
CC: atumball, bugs, hchiramm, knarra, pkarampu, ravishankar, sabose, sasundar
Keywords: Performance, Triaged
Fixed In Version: glusterfs-5.0, glusterfs-4.1.4
Doc Type: Bug Fix
Type: Bug
Last Closed: 2018-10-24 10:52:52 UTC
Attachments:
screenshot of ovirt events window showing the timeframe (EST) of the issue
gluster logs from gprfc085 (minus the cli.log)
gluster logs from gprfc086 (minus the cli.log)
gluster logs from gprfc087 (minus the cli.log)

Description Paul Cuzner 2015-03-04 21:03:45 UTC
Created attachment 998049 [details]
screenshot of ovirt events window showing the timeframe (EST) of the issue

Description of problem:
Environment: converged oVirt (3.5) + GlusterFS (3.6.2)
When I create multiple VMs from a template, I see self-heal issues reported by a 'vol heal <vol> info' command, and the VM creation itself fails.

Version-Release number of selected component (if applicable):


How reproducible:
2 out of 3 attempts to build from templates failed. I believe the one run that succeeded did so because the filesystem cache absorbed the I/O that would otherwise have hit the gluster bricks, reducing load.

Steps to Reproduce:
1. Create an oVirt 3.5 / GlusterFS 3.6.2 converged environment
2. Create a template to use to build VMs
3. Clear the cache on the converged nodes (drop_caches - see the sketch after these steps)
4. Use the "New VM" wizard to create the 3 VMs, ensuring the storage allocation policy is clone
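
For reference, a minimal shell sketch of step 3 (the gprfc085-087 host names are taken from the heal output below; passwordless root ssh is an assumption, and the loop is illustrative rather than part of the original run):

# drop the page cache, dentries and inodes on every converged node so that
# the clone I/O actually hits the gluster bricks
for host in gprfc085 gprfc086 gprfc087; do
    ssh root@${host} 'sync; echo 3 > /proc/sys/vm/drop_caches'
done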

Actual results:
VM creation fails for all 3, and a vol heal command issued during the run shows something like:

[root@gprfc085 ~]# date && gluster vol heal vmdomain info 
Tue Mar  3 14:41:45 EST 2015
Brick gprfc085.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

Brick gprfc086.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

Brick gprfc087.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

[root@gprfc085 ~]#
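
To track heal activity while the clones run, a simple watch on the heal counters is enough (volume name vmdomain as above; the 5-second interval is arbitrary):

watch -n 5 "gluster vol heal vmdomain info | grep -E 'Brick|Number of entries'"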

Expected results:
VM creation from a clone should succeed whenever the volume is online; if multiple requests are made concurrently, they should all succeed, at the cost of a longer overall runtime.

Additional info:

Comment 1 Paul Cuzner 2015-03-04 21:05:19 UTC
Created attachment 998050 [details]
gluster logs from gprfc085 (minus the cli.log)

Comment 2 Paul Cuzner 2015-03-04 21:07:30 UTC
Created attachment 998051 [details]
gluster logs from gprfc086 (minus the cli.log)

Comment 3 Paul Cuzner 2015-03-04 21:09:14 UTC
Created attachment 998052 [details]
gluster logs from gprfc087 (minus the cli.log)

Comment 4 Sahina Bose 2016-08-22 07:25:50 UTC
Kasturi, is this still an issue with glusterfs 3.7.14?

Comment 5 Yaniv Kaul 2017-11-16 13:26:43 UTC
(In reply to Sahina Bose from comment #4)
> Kasturi, is this still an issue with glusterfs 3.7.14?

Ping?

Comment 6 RamaKasturi 2017-11-17 14:31:33 UTC
Will check on this and get back on Monday.

Comment 7 RamaKasturi 2017-11-19 09:43:06 UTC
I performed the following steps to reproduce the bug and no longer hit the issue.

1) Created a VM and made a template out of it.
2) Dropped the cache on the hosts by running the command 'echo 3 > /proc/sys/vm/drop_caches'
3) Created three VMs, selecting the storage allocation policy 'clone'

The VMs were created successfully and heal info shows zero entries:

[root@rhsqa-grafton1 ~]# gluster volume heal vmstore info
Brick 10.70.36.79:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Brick 10.70.36.80:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Brick 10.70.36.81:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Comment 8 Amar Tumballi 2018-10-24 10:52:52 UTC
Not seen in the last 2 years.