Bug 1198810

Summary: VM creation from template fails when multiple builds are running concurrently
Product: [Community] GlusterFS
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Reporter: Paul Cuzner <pcuzner>
Assignee: Ravishankar N <ravishankar>
CC: atumball, bugs, hchiramm, knarra, pkarampu, ravishankar, sabose, sasundar
Keywords: Performance, Triaged
Fixed In Version: glusterfs-5.0, glusterfs-4.1.4
Doc Type: Bug Fix
Type: Bug
Last Closed: 2018-10-24 10:52:52 UTC
Attachments:
screenshot of ovirt events window showing the timeframe (EST) of the issue
gluster logs from gprfc085 (minus the cli.log)
gluster logs from gprfc086 (minus the cli.log)
gluster logs from gprfc087 (minus the cli.log)

Description Paul Cuzner 2015-03-04 21:03:45 UTC
Created attachment 998049 [details]
screenshot of ovirt events window showing the timeframe (EST) of the issue

Description of problem:
Environment: converged oVirt (3.5) + GlusterFS (3.6.2)
When I create multiple VMs from a template, I see self-heal issues reported by a 'vol heal <vol> info' command, and the VM creation itself fails.

Version-Release number of selected component (if applicable):


How reproducible:
2 out of 3 attempts to build from templates failed. I believe the one run that succeeded did so because the filesystem cache absorbed the I/O that would otherwise have hit the gluster bricks, reducing load.

Steps to Reproduce:
1. Create an oVirt 3.5 / GlusterFS 3.6.2 converged environment
2. Create a template to use to build VMs
3. Clear the cache on the converged nodes (drop_caches - see the sketch after these steps)
4. Use the "New VM" wizard to create the 3 VMs, ensuring the storage allocation policy is clone
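
For reference, a minimal shell sketch of step 3 (the gprfc085-087 host names are taken from the heal output below; passwordless root ssh is an assumption, and the loop is illustrative rather than part of the original run):

# drop the page cache, dentries and inodes on every converged node so that
# the clone I/O actually hits the gluster bricks
for host in gprfc085 gprfc086 gprfc087; do
    ssh root@${host} 'sync; echo 3 > /proc/sys/vm/drop_caches'
done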

Actual results:
VM creation fails for all 3, and a vol heal command issued during the run shows something like:

[root@gprfc085 ~]# date && gluster vol heal vmdomain info 
Tue Mar  3 14:41:45 EST 2015
Brick gprfc085.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

Brick gprfc086.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

Brick gprfc087.sbu.lab.eng.bos.redhat.com:/glusterfs/brick1/vmdomain/
/2ad90339-5c1b-4b0e-b728-3df651ecd025/images/66838053-2c16-41c7-ae54-47bd11ecf9c4/0dfb8a92-2e57-46a3-8d88-6c29419152fc - Possibly undergoing heal

Number of entries: 1

[root@gprfc085 ~]#
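
To track heal activity while the clones run, a simple watch on the heal counters is enough (volume name vmdomain as above; the 5-second interval is arbitrary):

watch -n 5 "gluster vol heal vmdomain info | grep -E 'Brick|Number of entries'"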

Expected results:
VM creation from a clone should succeed whenever the volume is online; if multiple requests are made concurrently, they should all succeed, at the cost of a longer overall runtime.

Additional info:

Comment 1 Paul Cuzner 2015-03-04 21:05:19 UTC
Created attachment 998050 [details]
gluster logs from gprfc085 (minus the cli.log)

Comment 2 Paul Cuzner 2015-03-04 21:07:30 UTC
Created attachment 998051 [details]
gluster logs from gprfc086 (minus the cli.log)

Comment 3 Paul Cuzner 2015-03-04 21:09:14 UTC
Created attachment 998052 [details]
gluster logs from gprfc087 (minus the cli.log)

Comment 4 Sahina Bose 2016-08-22 07:25:50 UTC
Kasturi, is this still an issue with glusterfs 3.7.14?

Comment 5 Yaniv Kaul 2017-11-16 13:26:43 UTC
(In reply to Sahina Bose from comment #4)
> Kasturi, is this still an issue with glusterfs 3.7.14?

Ping?

Comment 6 RamaKasturi 2017-11-17 14:31:33 UTC
Will check on this and get back on Monday.

Comment 7 RamaKasturi 2017-11-19 09:43:06 UTC
I performed the following steps to reproduce the bug and no longer hit the issue.

1) Created a VM and made a template out of it.
2) Dropped the cache on the hosts by running the command 'echo 3 > /proc/sys/vm/drop_caches'
3) Created three VMs, selecting the storage allocation policy 'clone'

The VMs were created successfully and heal info shows zero entries:

[root@rhsqa-grafton1 ~]# gluster volume heal vmstore info
Brick 10.70.36.79:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Brick 10.70.36.80:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Brick 10.70.36.81:/gluster_bricks/vmstore/vmstore
Status: Connected
Number of entries: 0

Comment 8 Amar Tumballi 2018-10-24 10:52:52 UTC
Not seen in the last 2 years.