Bug 1462807 - GlusterCreateBrick job fails and there are no messages [NEEDINFO]
Status: ASSIGNED
Product: Red Hat Storage Console
Classification: Red Hat
Component: Gluster Integration
Version: 3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: alpha
Target Release: 3-alpha
Assigned To: Shubhendu Tripathi
QA Contact: sds-qe-bugs
Depends On:
Blocks:
Reported: 2017-06-19 11:10 EDT by Filip Balák
Modified: 2017-06-27 00:34 EDT
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
shtripat: needinfo? (rkanade)


Attachments: None
Description Filip Balák 2017-06-19 11:10:34 EDT
Description of problem:
I try to create bricks via the API. The job is created but remains in the `new` state for a long time and then fails. After it fails, the `hostname/api/1.0/jobs/:job_id:/messages` API call returns an empty list, and the `hostname/api/1.0/jobs` API call starts returning `{"errors":{"message":"Invalid JSON received."}}` (as described in BZ 1460762). On the nodes passed to the API call, the directories given as the path in the GlusterCreateBrick call (/bricks/fs_gluster01) are not created.
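For reference, these are roughly the calls I use to check the job afterwards (a sketch; `:access_token:` and `:job_id:` are placeholders, as in the reproduction steps below):
```
# Messages for the failed job; the response body is an empty list.
curl -H 'Authorization: Bearer :access_token:' http://hostname/api/1.0/jobs/:job_id:/messages

# Listing all jobs then starts failing with the error from BZ 1460762:
# {"errors":{"message":"Invalid JSON received."}}
curl -H 'Authorization: Bearer :access_token:' http://hostname/api/1.0/jobs
```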

Version-Release number of selected component (if applicable):
tendrl-alerting-3.0-alpha.3.el7scon.noarch
tendrl-api-3.0-alpha.4.el7scon.noarch
tendrl-api-doc-3.0-alpha.4.el7scon.noarch
tendrl-api-httpd-3.0-alpha.4.el7scon.noarch
tendrl-commons-3.0-alpha.9.el7scon.noarch
tendrl-dashboard-3.0-alpha.4.el7scon.noarch
tendrl-node-agent-3.0-alpha.9.el7scon.noarch
tendrl-performance-monitoring-3.0-alpha.7.el7scon.noarch

How reproducible:
Probably 100%.
I tried it a few times and it behaved the same way every time.

Steps to Reproduce:
1. Import a cluster with 4 gluster nodes.
2. Restart all machines at the same time. (I do this because I load machine state from snapshots.)
3. After a few minutes, run:
```
curl -X POST -H 'Authorization: Bearer :access_token:' -d '{":node1_id:": {"/bricks/fs_gluster01": {"brick_name": "brick"}}, ":node2_id:": {"/bricks/fs_gluster01": {"brick_name": "brick"}}, ":node3_id:": {"/bricks/fs_gluster01": {"brick_name": "brick"}}, ":node4_id:": {"/bricks/fs_gluster01": {"brick_name": "brick"}}}' http://hostname/api/1.0/:cluster_id:/GlusterCreateBrick
```
4. Check `hostname/api/1.0/jobs/:job_id:` with the job_id returned in the response from the previous step (see the sketch below).
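For example (a sketch; the same Authorization header and placeholders as in step 3 are assumed):
```
curl -H 'Authorization: Bearer :access_token:' http://hostname/api/1.0/jobs/:job_id:
# The returned job stays in status "new" for a long time and then switches to failed.
```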

Actual results:
The job remains in the `new` state for a long time. After a while it fails, and there are no messages describing what failed. It also causes BZ 1460762.

Expected results:
The job should finish and create the bricks, or, if it fails, it should provide a message describing the error.

Additional info:
Comment 3 Nishanth Thomas 2017-06-20 00:08:06 EDT
I believe that it is failing due to https://github.com/Tendrl/gluster-integration/issues/315

Can you confirm? If you need any help, talk to @shubhendu
Comment 4 Filip Balák 2017-06-20 06:58:00 EDT
That might be the case. In /nodes/:node_id:/NodeContext/tags I see provisioner tags on two gluster nodes. The GlusterCreateBrick task also does not work from the UI after the machines are restarted.
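For the record, this is roughly how I look at the tags (a sketch; it assumes the path above is an etcd v2 key and that etcdctl is available on the server node):
```
# Show the NodeContext tags stored for a node (:node_id: is a placeholder).
etcdctl get /nodes/:node_id:/NodeContext/tags
```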
Comment 5 Anup Nivargi 2017-06-20 13:04:03 EDT
Upstream fix at https://github.com/Tendrl/api/pull/213
Comment 6 Shubhendu Tripathi 2017-06-21 04:55:51 EDT
Filip, is this the scenario where the cluster was created from the tendrl UI earlier and, after a cleanup of etcd, the same cluster was imported? If that's the case, it could be related to https://github.com/Tendrl/gluster-integration/issues/315 as mentioned by Nishanth.

Also, in the reproduction steps I see step 2, `Restart machines at the same time. (I do this because of loading machine state from snapshots)`. If the nodes are still starting and `tendrl-gluster-integration` does not come up as part of the restart, there is nothing to pick up the create-brick job and it times out.

In the latest builds all the tendrl services are now marked for restart, so with the latest build I expect the create-bricks job to be picked up after the nodes come back up.
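For example, after the nodes come back up, something like this could be used to confirm the services were restarted (a sketch; the exact unit names are assumed):
```
# Check that the tendrl daemons are running again after the reboot.
systemctl status tendrl-node-agent tendrl-gluster-integration

# Make sure they are enabled so they start automatically on boot.
systemctl enable tendrl-node-agent tendrl-gluster-integration
```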

I would suggest trying once with the latest builds. Also, if it is related to https://github.com/Tendrl/gluster-integration/issues/315, that is still being worked on and will be available in a later build.
Comment 7 Filip Balák 2017-06-23 11:00:19 EDT
The cluster was created with the gluster CLI and then imported into tendrl.
It might have been caused by an inactive `tendrl-gluster-integration` service. There should be some message in that case.
Comment 10 Shubhendu Tripathi 2017-06-27 00:34:08 EDT
@rohan, any thoughts on comment #7?
