Bug 1273559

Summary: Calamari REST API: "api/v2/cluster/<fsid>/cli" throws 503 server error
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: rakesh-gm <rgowdege>
Component: Calamari Assignee: Boris Ranto <branto>
Calamari sub component: Back-end QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Status: CLOSED WONTFIX Docs Contact:
Severity: medium    
Priority: unspecified CC: anharris, ceph-eng-bugs, flucifre, gmeno, hnallurv, kdreyer, vakulkar
Version: 1.3.1   
Target Milestone: rc   
Target Release: 1.3.4   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-20 20:56:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description rakesh-gm 2015-10-20 17:07:33 UTC
Description of problem:
In this Calamari REST API endpoint:
api/v2/cluster/<fsid>/cli

when a ceph command is sent via POST, the request fails with a 503 error and the response reports "No mon servers are responding".

I sent a POST with this body from the web interface:
{"command": "osd dump"}

and the output is: 
--------------------------
HTTP 503 SERVICE UNAVAILABLE
Vary: Accept
Content-Type: text/html; charset=utf-8
Allow: POST, OPTIONS

{
    "detail": "No mon servers are responding"
}
---------------------------

However, my mons are working fine and the services are running properly.
The relevant entries from /var/log/calamari/cthulhu.log are here: http://pastebin.test.redhat.com/321246
--------
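For anyone reproducing this outside the browser, the request described above can be sketched in Python roughly as follows. This is a minimal sketch, not the reporter's actual procedure: the host name and fsid are hypothetical placeholders, the JSON content type is an assumption, and authentication (which a real Calamari server requires) is left out. The request is only built here, not sent.

```python
import json
import urllib.request


def build_cli_request(base_url, fsid, command):
    """Build (but do not send) a POST request for the cluster CLI endpoint."""
    url = "{}/api/v2/cluster/{}/cli".format(base_url.rstrip("/"), fsid)
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


# Hypothetical host and fsid, for illustration only.
req = build_cli_request(
    "http://calamari.example.com",
    "00000000-0000-0000-0000-000000000000",
    "osd dump",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it; on an affected server that call is expected to raise `urllib.error.HTTPError` with code 503.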

Comment 3 Christina Meno 2015-10-21 19:44:27 UTC
The apache user is not able to run salt commands.
I'll need to investigate further.

The only affected component is the CLI endpoint.
Going to fix this in 1.3.2

Comment 4 Harish NV Rao 2015-10-28 16:28:49 UTC
Hi Gregory,

Is there any workaround available from the ceph side for this issue? This defect is affecting the automation progress.

Please let me know.

Regards,
Harish

Comment 5 Christina Meno 2015-10-28 18:30:02 UTC
There is no workaround. It is broken and upstream tests don't cover that endpoint yet. Would you please help me understand how it is holding up the automation progress?

Comment 6 Vasu Kulkarni 2015-10-28 18:52:01 UTC
Gregory,

I think he meant the Calamari API tests that we wanted to run against this endpoint. We wanted to cover a few other tests using this endpoint (such as running CLI commands that modify the cluster), and those will be blocked. It is fine if you want to fix it in 1.3.2; we can take it up during 1.3.2.

Comment 7 Christina Meno 2015-12-10 23:30:26 UTC
add upstream ticket here

Comment 8 Christina Meno 2016-02-12 16:26:33 UTC
I thought that this would be fixed by https://github.com/ceph/calamari/commit/fd4c820f907d288c9cc64cddb765dedd2d4b7268

but there is more going on here.

The problem is that these commands are issued by the wsgi app, which does not run as the same uid as salt-master, so when we run the commands here:
https://github.com/ceph/calamari/blob/1.3/rest-api/calamari_rest/views/remote_view_set.py#L48

we get failures trying to write to the root-owned salt log file.
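One way to confirm this kind of uid mismatch is to check, as the user the wsgi app runs as, whether the salt log file is writable. The following is a minimal diagnostic sketch and not part of Calamari; the helper name is hypothetical and the log path /var/log/salt/master is an assumption about the install.

```python
import os


def check_log_writable(path):
    """Report whether the current process could write to `path`."""
    return {
        "uid": os.getuid(),
        "path": path,
        "exists": os.path.exists(path),
        # os.access checks permissions for the real uid of this process.
        "writable": os.access(path, os.W_OK),
    }


# Assumed location of the salt master log; adjust for your install.
print(check_log_writable("/var/log/salt/master"))
```

Running this as the httpd user and as root and comparing the "writable" field should show whether the log file permissions are the point of failure.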

I see several solutions to this:
1. fix https://github.com/ceph/calamari/blob/1.3/salt/local/relax_salt_perms.sls
to include this log dir and make sure it sticks

2. move the mon_command implementation into cthulhu


3. update salt and see if they've fixed the provisions to allow other users to run a subset of salt commands


An orthogonal problem is that the upstream integration test won't catch this since they are running

Comment 9 Boris Ranto 2016-10-11 14:34:56 UTC
I've been trying to make this work and I do have a few notes:
 - relaxing the salt permissions does not help: from what I can see, salt does not allow non-root users to run commands by default. We would need to at least allow apache (www-user, or any other httpd user), and even that might not be enough
 - updating salt won't help, as the later versions of salt are not compatible with calamari at the moment, and I think we would still hit the first issue
 - moving the RemoteViewSet functions to cthulhu does seem to help, though; WIP patch:

https://github.com/ceph/calamari/commit/44994d317b7998360e7b5d042e85eb6226866e18

Comment 10 Boris Ranto 2016-10-11 20:54:24 UTC
This should be fixed by this PR, which also contains a fix for bz1347137:

https://github.com/ceph/calamari/pull/492