Bug 1291681

Summary: Pool creation fails if calamari is up and unused for quite some time
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Shubhendu Tripathi <shtripat>
Component: Calamari Assignee: Christina Meno <gmeno>
Calamari sub component: Back-end QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: ceph-eng-bugs, flucifre, gmeno, hnallurv, icolle, mbukatov, sankarshan, yweinste
Version: 1.3.1 Flags: icolle: needinfo+
Target Milestone: rc   
Target Release: 2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-23 19:29:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1291304    
Attachments:
Description Flags
calamri log none
cthulu log none

Description Shubhendu Tripathi 2015-12-15 12:31:37 UTC
Description of problem:
The pool creation API "POST /api/v2/cluster/<fsid>/pool" fails after trying for a long time. Here calamari_on_mon has been running for quite some time and is unused.

The command "supervisorctl status" still shows calamari as running, though:

-------------------------
[Node ~]# supervisorctl status
calamari-lite                    RUNNING    pid 28024, uptime 0:02:59
-------------------------

But the REST call from skyring fails after waiting for a minute or so.
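
The mismatch described above (supervisor reports RUNNING while the REST call fails) can be checked for explicitly. The following is only a minimal sketch, assuming the status line format shown above. The function names and the diagnosis strings are hypothetical and are not part of calamari or skyring.

```python
def parse_status_line(line):
    """Split one line of `supervisorctl status` output into (name, state)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("unrecognised status line: %r" % line)
    return fields[0], fields[1]


def diagnose(state, http_status):
    """Combine the supervisor state with the HTTP status of a REST probe.

    http_status is None when the probe timed out or could not connect.
    """
    if state != "RUNNING":
        return "process not running"
    if http_status is None:
        return "process running, API not responding"
    if http_status >= 500:
        return "process running, API returning server errors"
    return "process running, API responding"


# Status line copied from the output above; 500 stands in for a failed probe.
line = "calamari-lite                    RUNNING    pid 28024, uptime 0:02:59"
name, state = parse_status_line(line)
print(name, "->", diagnose(state, 500))
```

Keeping the two signals separate makes it clear that a RUNNING process alone does not show that the API is serving requests.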


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Start calamari_on_mon using the command "supervisorctl restart all" on the mon node
2. Keep it running for, say, 1 hour
3. From skyring, submit a pool creation request

Actual results:
After waiting for a minute or so, the request fails.

Expected results:
Pool creation should be successful even if calamari is unused for a long time.


Additional info:
Observed that calamari sometimes goes down if it is not used for quite some time.

Comment 2 Shubhendu Tripathi 2015-12-15 12:38:04 UTC
The issue happens very randomly. Once it has failed, retrying sometimes works, but the behavior is very random, as mentioned.

Comment 4 Shubhendu Tripathi 2015-12-16 03:18:10 UTC
Created attachment 1106273 [details]
calamri log

Comment 5 Shubhendu Tripathi 2015-12-16 03:21:03 UTC
Created attachment 1106274 [details]
cthulu log

Comment 6 Shubhendu Tripathi 2015-12-16 03:21:26 UTC
During pool creation, we use a dummy URL, /api/v2/auth/login, to retrieve the XSRF token, which we then pass in the POST request for pool creation.
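
The two-step flow can be sketched roughly as follows. This is only an illustration, not the skyring implementation: the cookie name "XSRF-TOKEN", the header name "X-XSRF-TOKEN", the helper names, and the pool attributes are assumptions, and authentication is left out.

```python
import json
import urllib.request
from http.cookies import SimpleCookie

# Assumed names: this report does not state the cookie or header name.
XSRF_COOKIE = "XSRF-TOKEN"
XSRF_HEADER = "X-XSRF-TOKEN"


def extract_xsrf_token(set_cookie_headers, cookie_name=XSRF_COOKIE):
    """Return the token from a list of Set-Cookie header values, or None."""
    for header in set_cookie_headers:
        cookie = SimpleCookie()
        cookie.load(header)
        if cookie_name in cookie:
            return cookie[cookie_name].value
    return None


def build_pool_request(base_url, fsid, token, pool_attrs):
    """Build the pool creation POST, sending the token as header and cookie."""
    url = "%s/api/v2/cluster/%s/pool" % (base_url.rstrip("/"), fsid)
    headers = {
        "Content-Type": "application/json",
        XSRF_HEADER: token,
        "Cookie": "%s=%s" % (XSRF_COOKIE, token),
    }
    body = json.dumps(pool_attrs).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def create_pool(base_url, fsid, pool_attrs, timeout=60):
    """GET the login URL for a token, then POST the pool (auth omitted)."""
    login_url = "%s/api/v2/auth/login" % base_url.rstrip("/")
    with urllib.request.urlopen(login_url, timeout=timeout) as resp:
        token = extract_xsrf_token(resp.headers.get_all("Set-Cookie") or [])
    if token is None:
        raise RuntimeError("login response carried no XSRF token cookie")
    request = build_pool_request(base_url, fsid, token, pool_attrs)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

With this structure, a server error on the token request raises urllib.error.HTTPError in the first step, so the pool creation POST is never sent.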

When it fails, I can see the error below in calamari.log:

----------------------
2015-12-15 04:14:43,981 - ERROR - django.request Internal Server Error: /api/v2/auth/login
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/handlers/base.py", line 103, in get_response
    resolver_match = resolver.resolve(request.path_info)
  File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/urlresolvers.py", line 319, in resolve
    for pattern in self.url_patterns:
  File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/urlresolvers.py", line 351, in url_patterns
    raise ImproperlyConfigured("The included urlconf %s doesn't have any patterns in it" % self.urlconf_name)
ImproperlyConfigured: The included urlconf calamari_web.urls doesn't have any patterns in it
-------------------------

Attached the log files for reference.
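
To find such failures in a longer log, the ERROR records can be pulled out together with the exception line that ends each traceback. This is a minimal sketch that assumes the record format shown above ("timestamp - LEVEL - logger message"); the function name is hypothetical and the sample traceback is abridged from the one above.

```python
import re

# One log record header, in the format shown in the excerpt above.
RECORD_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>[A-Z]+) - (?P<logger>\S+) (?P<message>.*)$"
)


def find_errors(lines):
    """Return one dict per ERROR record: timestamp, message, exception.

    'exception' is the last unindented traceback line that follows the
    record, or None when no traceback follows.
    """
    results = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = RECORD_RE.match(line)
        if match:
            # A new record starts; only ERROR records are tracked.
            current = None
            if match.group("level") == "ERROR":
                current = {
                    "timestamp": match.group("ts"),
                    "message": match.group("message"),
                    "exception": None,
                }
                results.append(current)
        elif (current is not None and line and not line[0].isspace()
              and not line.startswith("Traceback")):
            current["exception"] = line
    return results


# Abridged copy of the excerpt above (one traceback frame kept).
SAMPLE = """\
2015-12-15 04:14:43,981 - ERROR - django.request Internal Server Error: /api/v2/auth/login
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/urlresolvers.py", line 319, in resolve
    for pattern in self.url_patterns:
ImproperlyConfigured: The included urlconf calamari_web.urls doesn't have any patterns in it
"""

for entry in find_errors(SAMPLE.splitlines()):
    print(entry["timestamp"], entry["message"], "|", entry["exception"])
```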

Comment 9 Mike McCune 2016-03-28 22:39:30 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 11 Shubhendu Tripathi 2016-06-13 10:15:11 UTC
I haven't seen this issue with recent builds.

Comment 12 Harish NV Rao 2016-06-13 10:20:48 UTC
Moving this defect to verified state for now based on comment 11.

Comment 14 errata-xmlrpc 2016-08-23 19:29:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1755