Description of problem
======================

Sometimes the "Create Cluster" task fails without a clear reason. The lack of good logs makes it hard to debug, but it may be connected with a collectd/influxdb auth failure.

This BZ is concerned with *the fact that the create cluster operation fails* in this way every now and then. There is another BZ 1296215 which deals with the fact that there is no good evidence of what actually happens in similar cases.

Version-Release number of selected component
============================================

usm server machine:

~~~
# rpm -qa 'rhscon*'
rhscon-core-0.0.6-0.1.alpha1.el7.x86_64
rhscon-ui-0.0.6-0.1.alpha1.el7.noarch
rhscon-ceph-0.0.4-0.1.alpha1.el7.x86_64
# rpm -qa 'salt*'
salt-2015.5.5-1.el7.noarch
salt-master-2015.5.5-1.el7.noarch
# rpm -qa 'influxdb*'
influxdb-0.9.5.1-1.x86_64
# rpm -qa | grep mongo
mongodb-server-2.6.5-4.el7ost.x86_64
mongodb-2.6.5-4.el7ost.x86_64
~~~

node machines (OSD only):

~~~
# rpm -qa 'rhscon*'
rhscon-agent-0.0.3-0.2.alpha1.el7.noarch
# rpm -qa 'salt*'
salt-minion-2015.5.5-1.el7.noarch
salt-2015.5.5-1.el7.noarch
# rpm -qa collectd
collectd-5.4.2-1.el7ost.x86_64
~~~

node machines (OSD and MON):

~~~
# rpm -qa 'rhscon*'
rhscon-agent-0.0.3-0.2.alpha1.el7.noarch
# rpm -qa 'salt*'
salt-minion-2015.5.5-1.el7.noarch
salt-master-2015.5.5-1.el7.noarch
salt-2015.5.5-1.el7.noarch
# rpm -qa collectd
collectd-5.4.2-1.el7ost.x86_64
~~~

How reproducible
================

It happens now and then; I have noticed this in 2 cases out of 8 so far.

Steps to Reproduce
==================

1. Install skyring on the server and prepare a few hosts for cluster setup
2. Accept all nodes
3. Create a new cluster using the "Create Cluster" wizard

Actual results
==============

The Create Cluster task fails quickly after it is started (without a clear description of the problem available from the usm, as noted in BZ 1296215).

Skyring console log shows this (note that the operation started at 10:34 and failed immediately):

~~~
[negroni] Started POST /api/v1/clusters
ESC[36m2016-01-21T10:34:57.829+01:00 DEBUG cluster.go:173 POST_Clusters]ESC[0m Task Created: %!(EXTRA string=fb46db72-4718-4aca-a93d-c58e82817d83)
[negroni] Completed 202 Accepted in 6.674096ms
ESC[31m2016-01-21T10:34:57.837+01:00 ERROR utils.go:133 FailTask]ESC[0m Cluster creation failed: CreateCluster(): function failed at python side
[negroni] Started GET /api/v1/events
[negroni] Completed 200 OK in 1.562161ms
[negroni] Started GET /api/v1/tasks/fb46db72-4718-4aca-a93d-c58e82817d83
[negroni] Completed 200 OK in 4.224033ms
[negroni] Started GET /views/modal/custom-modal.html
[negroni] Completed 304 Not Modified in 117.358µs
[negroni] Started GET /api/v1/users/me
[negroni] Completed 200 OK in 4.308777ms
[negroni] Started GET /api/v1/clusters
[negroni] Completed 200 OK in 2.523943ms
[negroni] Started GET /api/v1/tasks/fb46db72-4718-4aca-a93d-c58e82817d83
[negroni] Completed 200 OK in 3.7848ms
[negroni] Started GET /api/v1/unmanaged_nodes
[negroni] Completed 200 OK in 93.116745ms
[negroni] Started GET /api/v1/unmanaged_nodes
[negroni] Completed 200 OK in 95.89419ms
~~~

Note that all log examples presented here are cut to show only the timeframe around the create cluster failure.
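The failed task can also be inspected directly via the same REST API endpoint the web UI polls (visible in the console log above). A minimal sketch, assuming an already authenticated session cookie; the server name and cookie file are placeholders:

~~~
# <usm-server> and session.cookie are placeholders, the task UUID is taken
# from the console log above
curl -s -b session.cookie \
    "http://<usm-server>/api/v1/tasks/fb46db72-4718-4aca-a93d-c58e82817d83" \
    | python -m json.tool
~~~

In the failing runs this does not reveal anything beyond the generic failure message quoted above, which is what BZ 1296215 tracks.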
There is no good evidence on the usm server:

* nothing special in /var/log/mongodb/mongodb.log:

~~~
2016-01-21T10:32:47.868+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:07.871+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:17.869+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:37.878+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:47.866+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:07.867+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:17.867+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:37.866+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:47.871+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:57.831+0100 [conn3] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:59.840+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:07.870+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:17.255+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:32.257+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:37.255+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:47.252+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:36:02.252+0100 [conn8] authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
~~~

* nothing special in /var/log/skyring/bigfin.log:

~~~
2016-01-21T10:34:57.837+01:00 ERROR utils.go:133 FailTask] Cluster creation failed: CreateCluster(): function failed at python side
~~~

* nothing special in /var/log/skyring/bigfin.log:

~~~
2016-01-21T10:34:57.829+01:00 DEBUG cluster.go:173 POST_Clusters] Task Created: %!(EXTRA string=fb46db72-4718-4aca-a93d-c58e82817d83)
~~~

* nothing special in /var/log/influxdb/influxdb.log:

~~~
[wal] 2016/01/21 10:33:51 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:33:51 write to index of partition 1 took 4.263082ms
[wal] 2016/01/21 10:34:01 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:01 write to index of partition 1 took 8.919023ms
[wal] 2016/01/21 10:34:11 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:11 write to index of partition 1 took 6.906756ms
[wal] 2016/01/21 10:34:21 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:21 write to index of partition 1 took 9.810116ms
[wal] 2016/01/21 10:34:31 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:31 write to index of partition 1 took 5.614884ms
[wal] 2016/01/21 10:34:41 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:41 write to index of partition 1 took 5.868523ms
[wal] 2016/01/21 10:34:51 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:51 write to index of partition 1 took 13.605395ms
[wal] 2016/01/21 10:35:01 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:35:01 write to index of partition 1 took 7.658229ms
~~~

On the node machines:

* there is nothing in /var/log/skynet/skynet.log (besides the 1st init entry)
* there is no event in /var/log/salt/minion.log during the failure:

~~~
2016-01-21 10:23:17,869 [salt.utils.schedule][INFO ][325] Running scheduled job: __mine_interval
2016-01-21 10:23:17,871 [salt.utils.schedule][DEBUG ][325] schedule: This job was scheduled with jid_include, adding to cache (jid_include defaults to True)
2016-01-21 10:23:17,871 [salt.utils.schedule][DEBUG ][325] schedule: This job was scheduled with a max number of 2
2016-01-21 10:23:17,890 [salt.utils.schedule][DEBUG ][7567] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'jid': '20160121102317878909', 'pid': 7567, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'schedule': '__mine_interval'}
2016-01-21 10:23:17,893 [salt.crypt ][DEBUG ][7567] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 10:23:17,895 [salt.crypt ][DEBUG ][7567] Loaded minion key: /etc/salt/pki/minion/minion.pem
2016-01-21 10:23:17,909 [salt.utils.event ][DEBUG ][7567] MinionEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pub.ipc
2016-01-21 10:23:17,909 [salt.utils.event ][DEBUG ][7567] MinionEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pull.ipc
2016-01-21 10:23:17,912 [salt.utils.event ][DEBUG ][7567] Sending event - data = {'_stamp': '2016-01-21T09:23:17.911837'}
2016-01-21 10:23:17,913 [salt.minion ][DEBUG ][325] Handling event 'salt/event/new_client\n\n\x81\xa6_stamp\xba2016-01-21T09:23:17.911837'
2016-01-21 10:23:17,915 [salt.utils.event ][DEBUG ][7567] Sending event - data = {'_stamp': '2016-01-21T09:23:17.915037', 'clear': False, 'cmd': '_mine', 'tok': '4K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2', 'data': {}, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com'}
2016-01-21 10:23:17,916 [salt.minion ][DEBUG ][325] Handling event '_minion_mine\n\n\x86\xa6_stamp\xba2016-01-21T09:23:17.915037\xa5clear\xc2\xa3cmd\xa5_mine\xa3tok\xda\x01\x004K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2\xa4data\x80\xa2id\xda\x00\'mbukatov-usm1-node1.os1.phx2.redhat.com'
2016-01-21 10:23:17,917 [salt.crypt ][DEBUG ][325] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 10:23:19,944 [salt.crypt ][DEBUG ][7567] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 11:23:17,281 [salt.utils.schedule][INFO ][325] Running scheduled job: __mine_interval
2016-01-21 11:23:17,283 [salt.utils.schedule][DEBUG ][325] schedule: This job was scheduled with jid_include, adding to cache (jid_include defaults to True)
2016-01-21 11:23:17,284 [salt.utils.schedule][DEBUG ][325] schedule: This job was scheduled with a max number of 2
2016-01-21 11:23:17,307 [salt.utils.schedule][DEBUG ][11671] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'jid': '20160121112317293530', 'pid': 11671, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'schedule': '__mine_interval'}
2016-01-21 11:23:17,351 [salt.crypt ][DEBUG ][11671] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 11:23:17,353 [salt.crypt ][DEBUG ][11671] Loaded minion key: /etc/salt/pki/minion/minion.pem
2016-01-21 11:23:17,384 [salt.utils.event ][DEBUG ][11671] MinionEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pub.ipc
2016-01-21 11:23:17,384 [salt.utils.event ][DEBUG ][11671] MinionEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pull.ipc
2016-01-21 11:23:17,386 [salt.utils.event ][DEBUG ][11671] Sending event - data = {'_stamp': '2016-01-21T10:23:17.386092'}
2016-01-21 11:23:17,388 [salt.minion ][DEBUG ][325] Handling event 'salt/event/new_client\n\n\x81\xa6_stamp\xba2016-01-21T10:23:17.386092'
2016-01-21 11:23:17,390 [salt.utils.event ][DEBUG ][11671] Sending event - data = {'_stamp': '2016-01-21T10:23:17.389795', 'clear': False, 'cmd': '_mine', 'tok': '4K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2', 'data': {}, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com'}
2016-01-21 11:23:17,391 [salt.minion ][DEBUG ][325] Handling event '_minion_mine\n\n\x86\xa6_stamp\xba2016-01-21T10:23:17.389795\xa5clear\xc2\xa3cmd\xa5_mine\xa3tok\xda\x01\x004K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2\xa4data\x80\xa2id\xda\x00\'mbukatov-usm1-node1.os1.phx2.redhat.com'
2016-01-21 11:23:17,393 [salt.crypt ][DEBUG ][325] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 11:23:19,420 [salt.crypt ][DEBUG ][11671] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
~~~

* it seems that collectd has an auth problem:

~~~
# journalctl -u collectd --since="2016-01-21 10:33" --until="2016-01-21 10:36"
-- Logs begin at Wed 2016-01-20 07:41:57 CET, end at Thu 2016-01-21 12:00:02 CET. --
Jan 21 10:33:02 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:33:32 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:34:02 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:34:32 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:35:02 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:35:32 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
~~~

Expected results
================

The Create Cluster operation finishes successfully.

Additional info
===============

When this issue happens, I can actually retry the Create Cluster operation via the usm web UI, but it always immediately fails in the same way. Moreover, given the lack of details, I'm not entirely sure about the root cause of this issue.
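As a side note on the suspected collectd/influxdb auth failure: the [Errno 13] errors reported by collectd could be narrowed down by exercising the InfluxDB HTTP API manually from the affected node. A minimal sketch, where the user, password and database name are placeholders that would have to be taken from the collectd InfluxDB plugin configuration on the node:

~~~
# placeholders: <usm-server>, <user>, <password>, <db>
curl -i -G -u '<user>:<password>' \
    "http://<usm-server>:8086/query?db=<db>" \
    --data-urlencode "q=SHOW MEASUREMENTS"
# HTTP 200 would suggest the InfluxDB credentials are fine and the EACCES
# seen by the collectd python plugin comes from the node itself (e.g. a
# denied outgoing connection), while HTTP 401 would point to an actual
# InfluxDB authentication problem
~~~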
This is a very old bug, filed before the integration with the new ceph-installer. If there is a cluster creation failure, I can find the reason why it failed (example in the attachment).

Last verified with:

ceph-ansible-1.0.5-31.el7scon.noarch
ceph-installer-1.0.14-1.el7scon.noarch
rhscon-ceph-0.0.37-1.el7scon.x86_64
rhscon-core-0.0.37-1.el7scon.x86_64
rhscon-core-selinux-0.0.37-1.el7scon.noarch
rhscon-ui-0.0.51-1.el7scon.noarch
Created attachment 1185152 [details]
cluster creation failure