Bug 1300656

Summary: "Create Cluster" task fails fast without clear reason
Product: [Red Hat Storage] Red Hat Storage Console
Reporter: Martin Bukatovic <mbukatov>
Component: core
core sub component: configuration
Assignee: Nishanth Thomas <nthomas>
QA Contact: sds-qe-bugs
Docs Contact:
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
CC: mkudlej, sankarshan
Version: 2
Target Milestone: ---
Target Release: 2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rhscon-ceph-0.0.23-1.el7scon.x86_64, rhscon-core-0.0.24-1.el7scon.x86_64, rhscon-ui-0.0.39-1.el7scon.noarch
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-19 05:33:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  cluster creation failure (flags: none)

Description Martin Bukatovic 2016-01-21 11:21:02 UTC
Description of problem
======================

Sometimes "Create Cluster" task fails without a clear reason. The lack of good
logs makes it hard to debug, but it may be connected with collectd/influxdb
auth failure.

This BZ is concerned with *the fact that the create cluster operation fails* in
this way every now and then. A separate BZ 1296215 deals with the fact that
there is no good evidence of what actually happens in such cases.

Version-Release number of selected component
============================================

usm server machine:

~~~
# rpm -qa 'rhscon*'
rhscon-core-0.0.6-0.1.alpha1.el7.x86_64
rhscon-ui-0.0.6-0.1.alpha1.el7.noarch
rhscon-ceph-0.0.4-0.1.alpha1.el7.x86_64 
# rpm -qa 'salt*'
salt-2015.5.5-1.el7.noarch
salt-master-2015.5.5-1.el7.noarch
# rpm -qa 'influxdb*'
influxdb-0.9.5.1-1.x86_64
# rpm -qa | grep mongo
mongodb-server-2.6.5-4.el7ost.x86_64
mongodb-2.6.5-4.el7ost.x86_64
~~~

node machines (OSD only):

~~~
# rpm -qa 'rhscon*'
rhscon-agent-0.0.3-0.2.alpha1.el7.noarch
# rpm -qa 'salt*'
salt-minion-2015.5.5-1.el7.noarch
salt-2015.5.5-1.el7.noarch
# rpm -qa collectd
collectd-5.4.2-1.el7ost.x86_64
~~~

node machines (OSD and MON):

~~~
# rpm -qa 'rhscon*'
rhscon-agent-0.0.3-0.2.alpha1.el7.noarch
# rpm -qa 'salt*'
salt-minion-2015.5.5-1.el7.noarch
salt-master-2015.5.5-1.el7.noarch
salt-2015.5.5-1.el7.noarch
# rpm -qa collectd
collectd-5.4.2-1.el7ost.x86_64
~~~

How reproducible
================

Intermittent: I have seen it in 2 out of 8 cluster creation attempts so far.

Steps to Reproduce
==================

1. Install skyring on the server and prepare a few hosts for cluster setup
2. Accept all nodes
3. Create a new cluster using the "Create Cluster" wizard (the underlying REST
   calls are sketched below)
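
The wizard drives the skyring REST API that is visible in the console log in
"Actual results": a POST to /api/v1/clusters followed by polling
GET /api/v1/tasks/<task-id>. The following is only a minimal sketch of that
flow; the server URL, the credentials, the request payload, and the response
field names are assumptions for illustration and are not taken from this report:

~~~
#!/usr/bin/env python
# Minimal sketch: submit a cluster create request and poll the resulting task.
# The server URL, credentials, payload, and response field names below are
# assumptions, not values from this report.
import time
import requests

BASE = "http://usm-server.example.com/api/v1"   # hypothetical server address

session = requests.Session()
session.auth = ("admin", "admin")               # hypothetical credentials

# Hypothetical minimal payload; the real wizard sends full node/network details.
payload = {"name": "testcluster", "type": "ceph"}

resp = session.post(BASE + "/clusters", json=payload)
print(resp.status_code)                         # the console log shows 202 Accepted
task_id = resp.json().get("taskid")             # response field name assumed

# Poll the task the same way the UI does until it reports success or failure.
while True:
    task = session.get("{0}/tasks/{1}".format(BASE, task_id)).json()
    print(task.get("status"), task)             # field names assumed
    if task.get("completed"):
        break
    time.sleep(5)
~~~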

Actual results
==============

The Create Cluster task fails shortly after it is started (without a clear
description of the problem being available from the usm, as noted in BZ 1296215).

The skyring console log shows the following (note that the operation started at
10:34 and failed immediately):

~~~
[negroni] Started POST /api/v1/clusters
ESC[36m2016-01-21T10:34:57.829+01:00 DEBUG    cluster.go:173 POST_Clusters]ESC[0m Task Created: %!(EXTRA string=fb46db72-4718-4aca-a93d-c58e82817d83)
[negroni] Completed 202 Accepted in 6.674096ms
ESC[31m2016-01-21T10:34:57.837+01:00 ERROR    utils.go:133 FailTask]ESC[0m Cluster creation failed: CreateCluster(): function failed at python side
[negroni] Started GET /api/v1/events
[negroni] Completed 200 OK in 1.562161ms
[negroni] Started GET /api/v1/tasks/fb46db72-4718-4aca-a93d-c58e82817d83
[negroni] Completed 200 OK in 4.224033ms
[negroni] Started GET /views/modal/custom-modal.html
[negroni] Completed 304 Not Modified in 117.358µs
[negroni] Started GET /api/v1/users/me
[negroni] Completed 200 OK in 4.308777ms
[negroni] Started GET /api/v1/clusters
[negroni] Completed 200 OK in 2.523943ms
[negroni] Started GET /api/v1/tasks/fb46db72-4718-4aca-a93d-c58e82817d83
[negroni] Completed 200 OK in 3.7848ms
[negroni] Started GET /api/v1/unmanaged_nodes
[negroni] Completed 200 OK in 93.116745ms
[negroni] Started GET /api/v1/unmanaged_nodes
[negroni] Completed 200 OK in 95.89419ms
~~~

Note that all log examples presented here are trimmed to show only the
timeframe around the create cluster failure.

There is no good evidence on the usm server:

 * nothing special in /var/log/mongodb/mongodb.log:

~~~
2016-01-21T10:32:47.868+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:07.871+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:17.869+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:37.878+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:33:47.866+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:07.867+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:17.867+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:37.866+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:47.871+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:57.831+0100 [conn3]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:34:59.840+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:07.870+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:17.255+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:32.257+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:37.255+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:35:47.252+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
2016-01-21T10:36:02.252+0100 [conn8]  authenticate db: skyring { authenticate: 1, nonce: "xxx", user: "admin", key: "xxx" }
~~~

 * nothing special in /var/log/skyring/bigfin.log beyond the failure message itself:

~~~
2016-01-21T10:34:57.837+01:00 ERROR    utils.go:133 FailTask] Cluster creation failed: CreateCluster(): function failed at python side
~~~

 * nothing special in /var/log/skyring/skyring.log:

~~~
2016-01-21T10:34:57.829+01:00 DEBUG    cluster.go:173 POST_Clusters] Task Created: %!(EXTRA string=fb46db72-4718-4aca-a93d-c58e82817d83)
~~~

 * nothing special in /var/log/influxdb/influxdb.log:

~~~
[wal] 2016/01/21 10:33:51 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:33:51 write to index of partition 1 took 4.263082ms
[wal] 2016/01/21 10:34:01 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:01 write to index of partition 1 took 8.919023ms
[wal] 2016/01/21 10:34:11 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:11 write to index of partition 1 took 6.906756ms
[wal] 2016/01/21 10:34:21 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:21 write to index of partition 1 took 9.810116ms
[wal] 2016/01/21 10:34:31 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:31 write to index of partition 1 took 5.614884ms
[wal] 2016/01/21 10:34:41 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:41 write to index of partition 1 took 5.868523ms
[wal] 2016/01/21 10:34:51 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:34:51 write to index of partition 1 took 13.605395ms
[wal] 2016/01/21 10:35:01 Flush due to idle. Flushing 10 series with 10 points and 620 bytes from partition 1
[wal] 2016/01/21 10:35:01 write to index of partition 1 took 7.658229ms
~~~

On the node machines:

 * there is nothing in /var/log/skynet/skynet.log (besides the first init entry)
 * there is no event in /var/log/salt/minion.log during the failure:

~~~
2016-01-21 10:23:17,869 [salt.utils.schedule][INFO    ][325] Running scheduled job: __mine_interval
2016-01-21 10:23:17,871 [salt.utils.schedule][DEBUG   ][325] schedule: This job was scheduled with jid_include, adding to cache (jid_include defaults to True)
2016-01-21 10:23:17,871 [salt.utils.schedule][DEBUG   ][325] schedule: This job was scheduled with a max number of 2
2016-01-21 10:23:17,890 [salt.utils.schedule][DEBUG   ][7567] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'jid': '20160121102317878909', 'pid': 7567, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'schedule': '__mine_interval'}
2016-01-21 10:23:17,893 [salt.crypt       ][DEBUG   ][7567] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 10:23:17,895 [salt.crypt       ][DEBUG   ][7567] Loaded minion key: /etc/salt/pki/minion/minion.pem
2016-01-21 10:23:17,909 [salt.utils.event ][DEBUG   ][7567] MinionEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pub.ipc
2016-01-21 10:23:17,909 [salt.utils.event ][DEBUG   ][7567] MinionEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pull.ipc
2016-01-21 10:23:17,912 [salt.utils.event ][DEBUG   ][7567] Sending event - data = {'_stamp': '2016-01-21T09:23:17.911837'}
2016-01-21 10:23:17,913 [salt.minion      ][DEBUG   ][325] Handling event 'salt/event/new_client\n\n\x81\xa6_stamp\xba2016-01-21T09:23:17.911837'
2016-01-21 10:23:17,915 [salt.utils.event ][DEBUG   ][7567] Sending event - data = {'_stamp': '2016-01-21T09:23:17.915037', 'clear': False, 'cmd': '_mine', 'tok': '4K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2', 'data': {}, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com'}
2016-01-21 10:23:17,916 [salt.minion      ][DEBUG   ][325] Handling event '_minion_mine\n\n\x86\xa6_stamp\xba2016-01-21T09:23:17.915037\xa5clear\xc2\xa3cmd\xa5_mine\xa3tok\xda\x01\x004K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2\xa4data\x80\xa2id\xda\x00\'mbukatov-usm1-node1.os1.phx2.redhat.com'
2016-01-21 10:23:17,917 [salt.crypt       ][DEBUG   ][325] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 10:23:19,944 [salt.crypt       ][DEBUG   ][7567] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 11:23:17,281 [salt.utils.schedule][INFO    ][325] Running scheduled job: __mine_interval
2016-01-21 11:23:17,283 [salt.utils.schedule][DEBUG   ][325] schedule: This job was scheduled with jid_include, adding to cache (jid_include defaults to True)
2016-01-21 11:23:17,284 [salt.utils.schedule][DEBUG   ][325] schedule: This job was scheduled with a max number of 2
2016-01-21 11:23:17,307 [salt.utils.schedule][DEBUG   ][11671] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'jid': '20160121112317293530', 'pid': 11671, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'schedule': '__mine_interval'}
2016-01-21 11:23:17,351 [salt.crypt       ][DEBUG   ][11671] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 11:23:17,353 [salt.crypt       ][DEBUG   ][11671] Loaded minion key: /etc/salt/pki/minion/minion.pem
2016-01-21 11:23:17,384 [salt.utils.event ][DEBUG   ][11671] MinionEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pub.ipc
2016-01-21 11:23:17,384 [salt.utils.event ][DEBUG   ][11671] MinionEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_b09f10ffda_pull.ipc
2016-01-21 11:23:17,386 [salt.utils.event ][DEBUG   ][11671] Sending event - data = {'_stamp': '2016-01-21T10:23:17.386092'}
2016-01-21 11:23:17,388 [salt.minion      ][DEBUG   ][325] Handling event 'salt/event/new_client\n\n\x81\xa6_stamp\xba2016-01-21T10:23:17.386092'
2016-01-21 11:23:17,390 [salt.utils.event ][DEBUG   ][11671] Sending event - data = {'_stamp': '2016-01-21T10:23:17.389795', 'clear': False, 'cmd': '_mine', 'tok': '4K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2', 'data': {}, 'id': 'mbukatov-usm1-node1.os1.phx2.redhat.com'}
2016-01-21 11:23:17,391 [salt.minion      ][DEBUG   ][325] Handling event '_minion_mine\n\n\x86\xa6_stamp\xba2016-01-21T10:23:17.389795\xa5clear\xc2\xa3cmd\xa5_mine\xa3tok\xda\x01\x004K\x98\xbc\xf9\xb9\xdaWh\xcf~\x99\xba9\x10\xf3\xdd\xaf\x81?\xf2`\xc3j|\xec\xb7\xb5\n\x90\xc0\t\xf9\xfa\xce\x05P`{}\x8b\x14\xd8\xcby\x94o0\xce\xfa%\xc6\x86\x07T\xb6\x9a\xe8\xe3\x9e\xc7\x9a\x8c\rbry\xf1\xc9\xf2J\x82\x9a>\xc4\xda\xd4\x9e\xaeo\xca\xe3\xc7A\xa7S\x8f\xc1y\xd8\xbd\xde\xb5<*\x8f36"}\n=tZ\xa8\xe5\xb1\xe6-|\xad\x0c\x00D\xb5\xfeq\x96\x10\x00\x0b\xe1\xe3\xb6&\x00#\xe5F\x03\x8e2\x0bN\x7f\xfb\xdcg\xc4\x92W\x17\xda\xce\xc1\xa94\xca\x14\xb7S\x91\x85\xec\x059\xdfs\xed$\xf0E\x899\xc2\x15SV\xfb \xb7]\xa2\xc879\xb7\xca\xe1Is\xfa\xa2\xad\xef\xc2\x0eXM)\xe9+\x15\x9f\x1b\x99\xe6[x\xc3\xd9\x17\xed\xa6\xe4~@A\x89}\x08\xd5WvV\xf3\x89\x82|\x14\x0c\x95q\xc2M\xf0r7\x05\x98\xc8Lg\xdc\x8ca\xa9\xcf\xf6?5d\x91\xc4\t\xd6\xb4\x8f"\xbdG\xbd\xe7\xe6O\xc2\xa4data\x80\xa2id\xda\x00\'mbukatov-usm1-node1.os1.phx2.redhat.com'
2016-01-21 11:23:17,393 [salt.crypt       ][DEBUG   ][325] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
2016-01-21 11:23:19,420 [salt.crypt       ][DEBUG   ][11671] Re-using SAuth for ('/etc/salt/pki/minion', 'mbukatov-usm1-node1.os1.phx2.redhat.com', 'tcp://172.16.180.90:4506')
~~~

 * collectd seems to have an authentication or permission problem when sending
   metrics to influxdb (a way to narrow this down is sketched after the log):

~~~
# journalctl -u collectd --since="2016-01-21 10:33" --until="2016-01-21 10:36"
-- Logs begin at Wed 2016-01-20 07:41:57 CET, end at Thu 2016-01-21 12:00:02 CET. --
Jan 21 10:33:02 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:33:32 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:34:02 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:34:32 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:35:02 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
Jan 21 10:35:32 mbukatov-usm1-node1.os1.phx2.redhat.com collectd[5402]: Collectd-InfluxDB.py: Error when sending metrics influxdb ([Errno 13] Permission denied)
~~~
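
One way to narrow down whether the "[Errno 13] Permission denied" comes from
InfluxDB rejecting the credentials or from the collectd process being unable to
open the connection at all (for example, SELinux confining the plugin) is to
replay a comparable write from a normal shell on the node. A minimal sketch;
the influxdb host, port, database name and credentials are assumptions:

~~~
#!/usr/bin/env python
# Send one line-protocol point to the InfluxDB 0.9 HTTP write API, roughly what
# the Collectd-InfluxDB.py plugin does. Host, port, db name and credentials are
# assumptions for illustration.
import requests

URL = "http://usm-server.example.com:8086/write"
PARAMS = {"db": "collectd", "u": "admin", "p": "admin"}   # hypothetical values
POINT = "test_metric,host=node1 value=1"

try:
    resp = requests.post(URL, params=PARAMS, data=POINT, timeout=5)
    # 204 -> write accepted, 401 -> InfluxDB rejected the credentials
    print(resp.status_code, resp.text)
except requests.exceptions.ConnectionError as err:
    # If this also fails from a plain shell, the problem is network-level; if it
    # succeeds here while collectd keeps logging Errno 13, suspect the collectd
    # process context (e.g. SELinux) rather than InfluxDB authentication.
    print("connection failed:", err)
~~~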

Expected results
================

The Create Cluster operation finishes successfully.

Additional info
===============

When this issue happens, I can retry the Create Cluster operation via the usm
web UI, but it always fails immediately in the same way.

Moreover, given the lack of details, I'm not entirely sure about the root cause
of this issue.

Comment 4 Martin Kudlej 2016-07-28 15:00:01 UTC
This is a very old bug, filed before the integration with the new ceph-installer.
If cluster creation fails now, I can find the reason why it failed (see the example in the attachment).
Last verified with
ceph-ansible-1.0.5-31.el7scon.noarch
ceph-installer-1.0.14-1.el7scon.noarch
rhscon-ceph-0.0.37-1.el7scon.x86_64
rhscon-core-0.0.37-1.el7scon.x86_64
rhscon-core-selinux-0.0.37-1.el7scon.noarch
rhscon-ui-0.0.51-1.el7scon.noarch

Comment 5 Martin Kudlej 2016-07-28 15:00:34 UTC
Created attachment 1185152 [details]
cluster creation failure