+++ This bug was initially created as a clone of Bug #723351 +++

Description of problem:
It seems that when you bring up a VMware-based deployment with more than two instances, it takes quite some time and condor places the jobs in "hold". Going into condor and releasing the jobs resolves the issue; however, I'm wondering if there is some default timeout for condor that can be adjusted for VMware.

Recreate:
1. Set up Conductor for VMware.
2. Create a deployment with four or more instances.
3. Start the deployable.
4. VMware will take 10-15 minutes to start.

[root@hp-dl180g6-01 ~]# condor_q

-- Submitter: hp-dl180g6-01.rhts.eng.bos.redhat.com : <10.16.65.63:41877> : hp-dl180g6-01.rhts.eng.bos.redhat.com
 ID     OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
17.0   aeolus   7/19 16:05   0+00:41:39 R  0   0.0  job_1_frontend_21
18.0   aeolus   7/19 16:05   0+00:41:39 R  0   0.0  job_1_backend_22
19.0   aeolus   7/19 16:05   0+00:42:10 R  0   0.0  job_1_middle01_23
20.0   aeolus   7/19 16:05   0+00:42:40 R  0   0.0  job_1_middle02_24
21.0   aeolus   7/19 16:11   0+00:36:06 R  0   0.0  job_2_frontend_25
22.0   aeolus   7/19 16:13   0+00:00:00 H  0   0.0  job_vmware1_fronte
23.0   aeolus   7/19 16:13   0+00:00:00 H  0   0.0  job_vmware1_backen
24.0   aeolus   7/19 16:13   0+00:00:00 H  0   0.0  job_vmware1_middle
25.0   aeolus   7/19 16:13   0+00:00:00 H  0   0.0  job_vmware1_middle
26.0   aeolus   7/19 16:17   0+00:30:13 R  0   0.0  job_userquota01_fr
27.0   aeolus   7/19 16:17   0+00:30:13 R  0   0.0  job_userquota01_ba
28.0   aeolus   7/19 16:17   0+00:30:13 R  0   0.0  job_userquota01_mi
29.0   aeolus   7/19 16:17   0+00:29:43 R  0   0.0  job_userquota01_mi
30.0   aeolus   7/19 16:21   0+00:26:40 R  0   0.0  job_userquota02_fr
31.0   aeolus   7/19 16:21   0+00:26:39 R  0   0.0  job_userquota02_ba
32.0   aeolus   7/19 16:34   0+00:13:03 R  0   0.0  job_userquota03_fr
33.0   aeolus   7/19 16:34   0+00:13:18 R  0   0.0  job_userquota03_ba
34.0   aeolus   7/19 16:34   0+00:13:00 R  0   0.0  job_userquota03_mi
35.0   aeolus   7/19 16:34   0+00:13:03 R  0   0.0  job_userquota03_mi
36.0   aeolus   7/19 16:36   0+00:12:18 R  0   0.0  job_userquota04_fr
37.0   aeolus   7/19 16:36   0+00:12:02 R  0   0.0  job_userquota04_ba
38.0   aeolus   7/19 16:36   0+00:11:48 R  0   0.0  job_userquota04_mi
39.0   aeolus   7/19 16:36   0+00:11:48 R  0   0.0  job_userquota04_mi

23 jobs; 0 idle, 19 running, 4 held

[root@hp-dl180g6-01 ~]# condor_release 22.0 23.0 24.0 25.0
Job 22.0 released
Job 23.0 released
Job 24.0 released
Job 25.0 released

[root@hp-dl180g6-01 ~]# condor_q

-- Submitter: hp-dl180g6-01.rhts.eng.bos.redhat.com : <10.16.65.63:41877> : hp-dl180g6-01.rhts.eng.bos.redhat.com
 ID     OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
17.0   aeolus   7/19 16:05   0+00:42:05 R  0   0.0  job_1_frontend_21
18.0   aeolus   7/19 16:05   0+00:42:05 R  0   0.0  job_1_backend_22
19.0   aeolus   7/19 16:05   0+00:42:36 R  0   0.0  job_1_middle01_23
20.0   aeolus   7/19 16:05   0+00:43:06 R  0   0.0  job_1_middle02_24
21.0   aeolus   7/19 16:11   0+00:36:32 R  0   0.0  job_2_frontend_25
22.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_fronte
23.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_backen
24.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_middle
25.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_middle
26.0   aeolus   7/19 16:17   0+00:30:39 R  0   0.0  job_userquota01_fr
27.0   aeolus   7/19 16:17   0+00:30:39 R  0   0.0  job_userquota01_ba
28.0   aeolus   7/19 16:17   0+00:30:39 R  0   0.0  job_userquota01_mi
29.0   aeolus   7/19 16:17   0+00:30:09 R  0   0.0  job_userquota01_mi
30.0   aeolus   7/19 16:21   0+00:27:06 R  0   0.0  job_userquota02_fr
31.0   aeolus   7/19 16:21   0+00:27:05 R  0   0.0  job_userquota02_ba
32.0   aeolus   7/19 16:34   0+00:13:29 R  0   0.0  job_userquota03_fr
33.0   aeolus   7/19 16:34   0+00:13:44 R  0   0.0  job_userquota03_ba
34.0   aeolus   7/19 16:34   0+00:13:26 R  0   0.0  job_userquota03_mi
35.0   aeolus   7/19 16:34   0+00:13:29 R  0   0.0  job_userquota03_mi
36.0   aeolus   7/19 16:36   0+00:12:44 R  0   0.0  job_userquota04_fr
37.0   aeolus   7/19 16:36   0+00:12:28 R  0   0.0  job_userquota04_ba
38.0   aeolus   7/19 16:36   0+00:12:14 R  0   0.0  job_userquota04_mi
39.0   aeolus   7/19 16:36   0+00:12:14 R  0   0.0  job_userquota04_mi

23 jobs; 4 idle, 19 running, 0 held

[root@hp-dl180g6-01 ~]# condor_q

-- Submitter: hp-dl180g6-01.rhts.eng.bos.redhat.com : <10.16.65.63:41877> : hp-dl180g6-01.rhts.eng.bos.redhat.com
 ID     OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
17.0   aeolus   7/19 16:05   0+00:42:21 R  0   0.0  job_1_frontend_21
18.0   aeolus   7/19 16:05   0+00:42:21 R  0   0.0  job_1_backend_22
19.0   aeolus   7/19 16:05   0+00:42:52 R  0   0.0  job_1_middle01_23
20.0   aeolus   7/19 16:05   0+00:43:22 R  0   0.0  job_1_middle02_24
21.0   aeolus   7/19 16:11   0+00:36:48 R  0   0.0  job_2_frontend_25
22.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_fronte
23.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_backen
24.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_middle
25.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_middle
26.0   aeolus   7/19 16:17   0+00:30:55 R  0   0.0  job_userquota01_fr
27.0   aeolus   7/19 16:17   0+00:30:55 R  0   0.0  job_userquota01_ba
28.0   aeolus   7/19 16:17   0+00:30:55 R  0   0.0  job_userquota01_mi
29.0   aeolus   7/19 16:17   0+00:30:25 R  0   0.0  job_userquota01_mi
30.0   aeolus   7/19 16:21   0+00:27:22 R  0   0.0  job_userquota02_fr
31.0   aeolus   7/19 16:21   0+00:27:21 R  0   0.0  job_userquota02_ba
32.0   aeolus   7/19 16:34   0+00:13:45 R  0   0.0  job_userquota03_fr
33.0   aeolus   7/19 16:34   0+00:14:00 R  0   0.0  job_userquota03_ba
34.0   aeolus   7/19 16:34   0+00:13:42 R  0   0.0  job_userquota03_mi
35.0   aeolus   7/19 16:34   0+00:13:45 R  0   0.0  job_userquota03_mi
36.0   aeolus   7/19 16:36   0+00:13:00 R  0   0.0  job_userquota04_fr
37.0   aeolus   7/19 16:36   0+00:12:44 R  0   0.0  job_userquota04_ba
38.0   aeolus   7/19 16:36   0+00:12:30 R  0   0.0  job_userquota04_mi
39.0   aeolus   7/19 16:36   0+00:12:30 R  0   0.0  job_userquota04_mi

23 jobs; 4 idle, 19 running, 0 held

[root@hp-dl180g6-01 ~]# condor_q

-- Submitter: hp-dl180g6-01.rhts.eng.bos.redhat.com : <10.16.65.63:41877> : hp-dl180g6-01.rhts.eng.bos.redhat.com
 ID     OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
17.0   aeolus   7/19 16:05   0+00:42:36 R  0   0.0  job_1_frontend_21
18.0   aeolus   7/19 16:05   0+00:42:36 R  0   0.0  job_1_backend_22
19.0   aeolus   7/19 16:05   0+00:43:07 R  0   0.0  job_1_middle01_23
20.0   aeolus   7/19 16:05   0+00:43:37 R  0   0.0  job_1_middle02_24
21.0   aeolus   7/19 16:11   0+00:37:03 R  0   0.0  job_2_frontend_25
22.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_fronte
23.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_backen
24.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_middle
25.0   aeolus   7/19 16:13   0+00:00:00 I  0   0.0  job_vmware1_middle
26.0   aeolus   7/19 16:17   0+00:31:10 R  0   0.0  job_userquota01_fr
27.0   aeolus   7/19 16:17   0+00:31:10 R  0   0.0  job_userquota01_ba
28.0   aeolus   7/19 16:17   0+00:31:10 R  0   0.0  job_userquota01_mi
29.0   aeolus   7/19 16:17   0+00:30:40 R  0   0.0  job_userquota01_mi
30.0   aeolus   7/19 16:21   0+00:27:37 R  0   0.0  job_userquota02_fr
31.0   aeolus   7/19 16:21   0+00:27:36 R  0   0.0  job_userquota02_ba
32.0   aeolus   7/19 16:34   0+00:14:00 R  0   0.0  job_userquota03_fr
33.0   aeolus   7/19 16:34   0+00:14:15 R  0   0.0  job_userquota03_ba
34.0   aeolus   7/19 16:34   0+00:13:57 R  0   0.0  job_userquota03_mi
35.0   aeolus   7/19 16:34   0+00:14:00 R  0   0.0  job_userquota03_mi
36.0   aeolus   7/19 16:36   0+00:13:15 R  0   0.0  job_userquota04_fr
37.0   aeolus   7/19 16:36   0+00:12:59 R  0   0.0  job_userquota04_ba
38.0   aeolus   7/19 16:36   0+00:12:45 R  0   0.0  job_userquota04_mi
39.0   aeolus   7/19 16:36   0+00:12:45 R  0   0.0  job_userquota04_mi

23 jobs; 4 idle, 19 running, 0 held

[root@hp-dl180g6-01 ~]# condor_q

-- Submitter: hp-dl180g6-01.rhts.eng.bos.redhat.com : <10.16.65.63:41877> : hp-dl180g6-01.rhts.eng.bos.redhat.com
 ID     OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
17.0   aeolus   7/19 16:05   0+00:42:42 R  0   0.0  job_1_frontend_21
18.0   aeolus   7/19 16:05   0+00:42:42 R  0   0.0  job_1_backend_22
19.0   aeolus   7/19 16:05   0+00:43:13 R  0   0.0  job_1_middle01_23
20.0   aeolus   7/19 16:05   0+00:43:43 R  0   0.0  job_1_middle02_24
21.0   aeolus   7/19 16:11   0+00:37:09 R  0   0.0  job_2_frontend_25
22.0   aeolus   7/19 16:13   0+00:00:05 R  0   0.0  job_vmware1_fronte
23.0   aeolus   7/19 16:13   0+00:00:05 R  0   0.0  job_vmware1_backen
24.0   aeolus   7/19 16:13   0+00:00:04 R  0   0.0  job_vmware1_middle
25.0   aeolus   7/19 16:13   0+00:00:05 R  0   0.0  job_vmware1_middle
26.0   aeolus   7/19 16:17   0+00:31:16 R  0   0.0  job_userquota01_fr
27.0   aeolus   7/19 16:17   0+00:31:16 R  0   0.0  job_userquota01_ba
28.0   aeolus   7/19 16:17   0+00:31:16 R  0   0.0  job_userquota01_mi
29.0   aeolus   7/19 16:17   0+00:30:46 R  0   0.0  job_userquota01_mi
30.0   aeolus   7/19 16:21   0+00:27:43 R  0   0.0  job_userquota02_fr
31.0   aeolus   7/19 16:21   0+00:27:42 R  0   0.0  job_userquota02_ba
32.0   aeolus   7/19 16:34   0+00:14:06 R  0   0.0  job_userquota03_fr
33.0   aeolus   7/19 16:34   0+00:14:21 R  0   0.0  job_userquota03_ba
34.0   aeolus   7/19 16:34   0+00:14:03 R  0   0.0  job_userquota03_mi
35.0   aeolus   7/19 16:34   0+00:14:06 R  0   0.0  job_userquota03_mi
36.0   aeolus   7/19 16:36   0+00:13:21 R  0   0.0  job_userquota04_fr
37.0   aeolus   7/19 16:36   0+00:13:05 R  0   0.0  job_userquota04_ba
38.0   aeolus   7/19 16:36   0+00:12:51 R  0   0.0  job_userquota04_mi
39.0   aeolus   7/19 16:36   0+00:12:51 R  0   0.0  job_userquota04_mi

23 jobs; 0 idle, 23 running, 0 held

--- Additional comment from matt on 2011-07-20 15:50:31 EDT ---

FYI - condor_q 22.0 -l | grep LastHoldReason ->
"Create_Instance_Failure: Failed to perform transfer: Server returned nothing (no headers, no data)"

The jobs were running. Should investigate why the transfer failed. Possibly a timing issue?

--- Additional comment from clalance on 2011-07-20 16:11:38 EDT ---

Yeah, those sorts of errors are usually some sort of timeout, or a bug in deltacloud itself. At the very least, deltacloudd should always be returning an error code (and not no headers, no data).
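A note on the manual workaround above: rather than releasing each cluster ID by hand, held jobs can be inspected and released in bulk with standard condor tools. A minimal sketch (the constraint expression is illustrative, not taken from this report):

  # list held jobs together with the hold reason condor recorded for each
  condor_q -hold

  # release everything currently held (JobStatus == 5 means "Held") in one shot
  condor_release -constraint 'JobStatus == 5'

On the "default timeout" question in the description: one knob that exists for this is SYSTEM_PERIODIC_RELEASE in the condor configuration. A sketch, assuming a five-minute grace period and at most three automatic releases (both numbers are arbitrary choices, not validated against this setup):

  # Assumption: auto-release jobs the system put on hold after 5 minutes,
  # at most 3 times, instead of leaving them held for a manual condor_release.
  SYSTEM_PERIODIC_RELEASE = (JobStatus == 5) && \
      ((time() - EnteredCurrentStatus) > 300) && (NumSystemHolds < 3)

Neither of these addresses the underlying Create_Instance_Failure; they only avoid the manual condor_release step while the root cause is investigated.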
Ugh, this is *not* good to blame on NFS. Different NFS implementations have extremely different performance with VMware: enterprise NAS gear, such as NetApp and similar, delivers _very good_ performance, while Linux-based NFS servers generally don't. "NFS" itself isn't to blame here.
BZ 723894 - VMware deployments to low spec NFS datastores error out

Low-spec NFS datastores are not recommended due to poor performance.
Removing from tracker.
Release pending...
Closing out old bugs.
Permanent close.