Bug 1198810 - vm creation from template fails when multiple builds running concurrently
Summary: vm creation from template fails when multiple builds running concurrently
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-03-04 21:03 UTC by Paul Cuzner
Modified: 2018-10-24 10:52 UTC
CC List: 8 users

Fixed In Version: glusterfs-5.0,glusterfs-4.1.4
Clone Of:
Environment:
Last Closed: 2018-10-24 10:52:52 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
screenshot of ovirt events window showing the timeframe (EST) of the issue (96.87 KB, image/png)
2015-03-04 21:03 UTC, Paul Cuzner
gluster logs from gprfc085 (minus the cli.log) (2.85 MB, application/x-gzip)
2015-03-04 21:05 UTC, Paul Cuzner
gluster logs from gprfc086 (minus the cli.log) (9.12 MB, application/x-gzip)
2015-03-04 21:07 UTC, Paul Cuzner
gluster logs from gprfc087 (minus the cli.log) (7.46 MB, application/x-gzip)
2015-03-04 21:09 UTC, Paul Cuzner

Description Paul Cuzner 2015-03-04 21:03:45 UTC
Created attachment 998049 [details]
screenshot of ovirt events window showing the timeframe (EST) of the issue

Description of problem:
Environment - converged ovirt (3.5)+glusterfs (3.6.2)
When I create multiple VMs from a template, a 'vol heal <vol> info' command reports self-heal issues, and the VM creation itself fails.

Version-Release number of selected component (if applicable):


How reproducible:
2 of 3 attempts to build from templates failed. I believe the one run that succeeded did so because the filesystem cache circumvented I/O to the gluster brick, reducing load.

Steps to Reproduce:
1. Create an ovirt 3.5/glusterfs 3.6.2 environment
2. Create a template to use to build VMs
3. Clear the cache on the converged nodes (drop_caches)
4. Use the "New VM" wizard to create the 3 VMs, ensuring the storage allocation policy is clone

Actual results:
VM creation fails for all 3, and a vol heal command issued during the run shows something like:

[root@gprfc085 ~]# date && gluster vol heal vmdomain info 
Tue Mar  3 14:41:45 EST 2015
Brick gprfc085.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

Brick gprfc086.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

Brick gprfc087.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

[root@gprfc085 ~]#
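The heal output above can be spot-checked programmatically. A minimal sketch (brick paths shortened, sample output inlined; in practice you would pipe the live output of `gluster vol heal <vol> info` instead) that sums the pending entries across bricks:

```shell
# Sample of the heal info output captured above (abbreviated).
heal_output='Brick gprfc085:/glusterfs/brick1/vmdomain/
Number of entries: 1

Brick gprfc086:/glusterfs/brick1/vmdomain/
Number of entries: 1

Brick gprfc087:/glusterfs/brick1/vmdomain/
Number of entries: 1'

# Sum the "Number of entries" counts reported per brick.
total=$(printf '%s\n' "$heal_output" \
  | awk -F': ' '/^Number of entries/ {sum += $2} END {print sum+0}')
echo "pending heal entries: $total"
```

A nonzero total during a clone run indicates files still undergoing (or queued for) self-heal, which matches the symptom reported here.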

Expected results:
VM creation from clone should work whenever the volume is online; if multiple requests are made, they should all succeed, at the expense of a longer runtime.

Additional info:

Comment 1 Paul Cuzner 2015-03-04 21:05:19 UTC
Created attachment 998050 [details]
gluster logs from gprfc085 (minus the cli.log)

Comment 2 Paul Cuzner 2015-03-04 21:07:30 UTC
Created attachment 998051 [details]
gluster logs from gprfc086 (minus the cli.log)

Comment 3 Paul Cuzner 2015-03-04 21:09:14 UTC
Created attachment 998052 [details]
gluster logs from gprfc087 (minus the cli.log)

Comment 4 Sahina Bose 2016-08-22 07:25:50 UTC
Kasturi, is this still an issue with glusterfs 3.7.14?

Comment 5 Yaniv Kaul 2017-11-16 13:26:43 UTC
(In reply to Sahina Bose from comment #4)
> Kasturi, is this still an issue with glusterfs 3.7.14?

Ping?

Comment 6 RamaKasturi 2017-11-17 14:31:33 UTC
Will check on this and get back on Monday.

Comment 7 RamaKasturi 2017-11-19 09:43:06 UTC
I performed the following steps to reproduce the bug and no longer hit the issue.

1) Created a VM and created a template out of it.
2) Dropped caches on the hosts by running 'echo 3 > /proc/sys/vm/drop_caches'
3) Created three VMs, selecting the storage allocation policy 'clone'

The VMs were created successfully, and heal info shows zero entries.

[root@rhsqa-grafton1 ~]# gluster volume heal vmstore info
Brick 10.70.36.79:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Brick 10.70.36.80:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Brick 10.70.36.81:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Comment 8 Amar Tumballi 2018-10-24 10:52:52 UTC
Not seen in the last 2 years.

