Bug 1327983 - Mon Install Hangs for ever
Summary: Mon Install Hangs for ever
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Storage Console
Classification: Red Hat
Component: ceph-installer
Version: 2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 2
Assignee: Alfredo Deza
QA Contact: Rachana Patel
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-18 08:12 UTC by Nishanth Thomas
Modified: 2016-08-23 19:49 UTC
CC List: 9 users

Fixed In Version: ceph-installer-1.0.5-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-23 19:49:15 UTC
Target Upstream Version:


Attachments
ceph installer logs (604.96 KB, text/plain)
2016-04-18 08:12 UTC, Nishanth Thomas


Links
System ID: Red Hat Product Errata RHEA-2016:1754
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: New packages: Red Hat Storage Console 2.0
Last Updated: 2017-04-18 19:09:06 UTC

Description Nishanth Thomas 2016-04-18 08:12:40 UTC
Created attachment 1148124
ceph installer logs

Description of problem:

Invoking /api/mon/install/ hangs forever.

Input:
curl -d "{\"calamari\": true, \"hosts\": [\"dhcp46-139.lab.eng.blr.redhat.com\"],\"redhat_storage\":false,\"redhat_use_cdn\":true}" http://dhcp46-65.lab.eng.blr.redhat.com:8181/api/mon/install/

It looks like the fix for bug 1322907 is causing this issue.
The logs are attached.
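
For anyone reproducing this from a script rather than curl, here is a minimal Python sketch of the same request with a client-side timeout, so the caller fails fast instead of waiting indefinitely. The function names, the 30-second default, and the JSON Content-Type header are assumptions for illustration (curl -d as written above does not send that header); only the endpoint path and payload fields come from the command above.

```python
import json
import urllib.request


def build_mon_install_request(api_base, hosts):
    """Build a POST request mirroring the curl payload in this report.

    build_mon_install_request is a hypothetical helper, not part of
    ceph-installer.
    """
    payload = {
        "calamari": True,
        "hosts": hosts,
        "redhat_storage": False,
        "redhat_use_cdn": True,
    }
    return urllib.request.Request(
        api_base.rstrip("/") + "/api/mon/install/",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the original curl call did not set this header.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_with_timeout(request, timeout=30):
    """Send the request; raises an error if no response arrives in time."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

Calling post_with_timeout(build_mon_install_request("http://dhcp46-65.lab.eng.blr.redhat.com:8181", ["dhcp46-139.lab.eng.blr.redhat.com"])) would then surface a timeout error on the client instead of an open-ended wait.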

Version-Release number of selected component (if applicable):
ceph-ansible-1.0.5-3.el7.noarch.rpm                
ceph-installer-1.0.4-1.el7.noarch.rpm  
calamari-server-1.4.0-0.5.rc8.el7cp

How reproducible:

Always

Comment 2 Alfredo Deza 2016-04-18 14:57:13 UTC
In /var/log/messages for dhcp46-139.lab.eng.blr.redhat.com I can see:

Apr 18 19:15:23 dhcp46-139 salt-minion: [ERROR   ] The Salt Master has cached the public key for this node, this salt minion will wait for 10 seconds before attempting to re-authenticate
Apr 18 19:15:33 dhcp46-139 salt-minion: [ERROR   ] The Salt Master has cached the public key for this node, this salt minion will wait for 10 seconds before attempting to re-authenticate

That seems like a potential issue on the salt-master (dhcp46-65).

Comment 3 Christina Meno 2016-04-18 22:56:27 UTC
Shubhendu reproduced this for us. Thank you!

Analysis:
The issue seems to be caused by the ceph-installer task model not allowing Unicode to pass from Ansible stdout into SQLite.

Excerpt from /var/log/messages:
Apr 19 01:46:49 dhcp47-78 celery: [2016-04-19 01:46:49,529: ERROR/MainProcess] Task ceph_installer.tasks.call_ansible[b0fff403-a785-4fc9-939c-2acd0be7e608] raised unexpected: InvalidRequestError('This Session\'s transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. u\'UPDATE tasks SET stderr=?, stdout=?, ended=?, succeeded=?, exit_code=? WHERE tasks.id = ?\' (\'\', \'\\nPLAY [mons] 


So "hung" is somewhat inaccurate. It seems that after this error, requests fail to respond even though the request appears to succeed.


https://github.com/ceph/ceph-installer/pull/132/files
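
As an illustration of the failure mode in the log excerpt (raw byte strings from command output being written to SQLite text columns), here is a minimal sketch that decodes output to text before storing it. This is not the change made in the pull request above. The to_text and record_task_output helpers and the simplified tasks table are assumptions for illustration. Note that the ProgrammingError in the log comes from the Python 2 sqlite3 module; on Python 3, undecoded bytes would be stored as a BLOB rather than raising, so decoding up front mainly keeps the column consistently textual.

```python
import sqlite3


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings (hypothetical helper)."""
    if isinstance(value, bytes):
        # Undecodable bytes become U+FFFD instead of raising an error.
        return value.decode(encoding, errors="replace")
    return value


def record_task_output(conn, task_id, stdout, stderr):
    """Store captured output as text in a simplified tasks table."""
    conn.execute(
        "UPDATE tasks SET stderr=?, stdout=? WHERE id=?",
        (to_text(stderr), to_text(stdout), task_id),
    )
    conn.commit()


# Demo against an in-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, stdout TEXT, stderr TEXT)")
conn.execute("INSERT INTO tasks (id) VALUES (?)", ("demo-task",))

# Simulated command output containing a non-ASCII (UTF-8 encoded) character.
record_task_output(conn, "demo-task", b"\nPLAY [mons] \xe2\x9c\x93", b"")

stored = conn.execute(
    "SELECT stdout FROM tasks WHERE id=?", ("demo-task",)
).fetchone()[0]
```

Decoding at the boundary where subprocess output is captured means every later layer (ORM, database driver) only ever sees text, which avoids the flush failure and the rolled-back session seen in the log.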

Comment 9 Rachana Patel 2016-07-28 23:00:10 UTC
Verified with:

ceph-ansible-1.0.5-23.el7scon.noarch
ceph-installer-1.0.12-3.el7scon.noarch

Working as expected, hence moving to VERIFIED.

Comment 11 errata-xmlrpc 2016-08-23 19:49:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754

