Description of problem
======================

When the "Create Cluster" task fails, I see a discrepancy in its status as reported by various components of the USM web UI.

Version-Release number of selected component
============================================

rhscon-ceph-0.0.6-8.el7.x86_64
rhscon-core-0.0.8-7.el7.x86_64
rhscon-ui-0.0.16-1.el7.noarch

How reproducible
================

Hard to say, because this BZ assumes that cluster creation fails in a specific way.

Steps to Reproduce
==================

1. Install skyring on the server and prepare a few hosts for cluster setup
2. Accept all nodes
3. Start the "Create Cluster" wizard and create a cluster using the default config
4. The Create Cluster task fails

Actual results
==============

The state of the cluster and of the create cluster task is not aligned across the various components of the USM web UI:

* the task popup window still shows an unfinished "running" progress bar for this task
* the clusters page shows this cluster in a failed state (a red cross in a circle icon is displayed next to the cluster name)
* the tasks page shows the task as *Failed*, but the failed icon is missing here

The other problem is that when I let the task progress bar run for about an hour, it didn't change in any way. (A sketch of one way to check the backend task record directly, independent of the UI, follows after the log excerpt below.)

Expected results
================

All USM web components should show that the task is either still running or failed; discrepancies between the task popup and the tasks page should not happen. The task progress bar should stop/finish immediately when the task fails.

Additional info
===============

A hopefully useful/related part of the logs is below; the full skyring.log file is attached.

~~~
2016-02-25T11:03:26+0000 INFO saltwrapper.py:50 saltwrapper.wrapper] rv={'mbukatov-usm1-node4.example.com': {'file_|-/etc/ceph/mbukatov-usm1-cluster1.conf_|-/etc/ceph/mbukatov-usm1-cluster1.conf_|-managed': {'comment': 'File /etc/ceph/mbukatov-usm1-cluster1.conf is in the correct state', 'name': '/etc/ceph/mbukatov-usm1-cluster1.conf', 'start_time': '11:03:26.342414', 'result': True, 'duration': 16.959, '__run_num__': 0, 'changes': {}}}}
2016-02-25T11:03:26+0000 ERROR saltwrapper.py:498 saltwrapper.AddOSD] admin:4b937f55-f6b0-4962-b0be-c07a119cd6fc-add_osd failed. error={'mbukatov-usm1-node4.os1.phx2.redhat.com': {'pid': 31390, 'retcode': 1, 'stderr': "2016-02-25 11:03:25.653244 7fe2efe24780 -1 did not load config file, using default settings.\n2016-02-25 11:03:25.665761 7f1ef057d780 -1 did not load config file, using default settings.\nlibust[31401/31401]: Warning: HOME environment variable not set. Disabling LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)\nceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'mbukatov-usm1-cluster1', 'start', 'osd.3']' returned non-zero exit status 1\nceph-disk: Error: One or more partitions failed to activate", 'stdout': '/etc/init.d/ceph: osd.3 not found (/etc/ceph/mbukatov-usm1-cluster1.conf defines osd.usm1-cluster1-3 mon.a, /var/lib/ceph defines osd.usm1-cluster1-3)'}}
2016-02-25T11:03:26.381+01:00 ERROR utils.go:133 FailTask] admin:4b937f55-f6b0-4962-b0be-c07a119cd6fc-Failed adding all OSDs while create cluster mbukatov-usm1-cluster1: <nil>
2016-02-25T11:03:28.359+01:00 ERROR cluster.go:227 func·001] admin:4b937f55-f6b0-4962-b0be-c07a119cd6fc- Failed to create the cluster mbukatov-usm1-cluster1
2016-02-25T11:03:28.36+01:00 DEBUG lockmanager.go:75 ReleaseLock] Currently Locked:%!(EXTRA map[uuid.UUID]*lock.LockInternal=map[251ae1f8-c9bd-43bf-a78c-10aa5afe7ec9:0xc20b4527c0 de342cd8-2535-4eb2-8583-ce03535c3fe7:0xc20b4529e0 462b1a36-c987-4a2b-806d-2b199e632630:0xc20b452c00 870d6ca1-06ff-45a8-a9e5-aaa8f06bcbb0:0xc20b452e20 fdff7bc2-1c27-4454-b775-f7ec48a52edb:0xc20b453040])
2016-02-25T11:03:28.36+01:00 DEBUG lockmanager.go:76 ReleaseLock] Releasing the locks for:%!(EXTRA map[uuid.UUID]string=map[251ae1f8-c9bd-43bf-a78c-10aa5afe7ec9:POST_Clusters : mbukatov-usm1-node3.example.com de342cd8-2535-4eb2-8583-ce03535c3fe7:POST_Clusters : mbukatov-usm1-node4.example.com 462b1a36-c987-4a2b-806d-2b199e632630:POST_Clusters : mbukatov-usm1-mon1.example.com 870d6ca1-06ff-45a8-a9e5-aaa8f06bcbb0:POST_Clusters : mbukatov-usm1-node1.os1.phx2.redhat.com fdff7bc2-1c27-4454-b775-f7ec48a52edb:POST_Clusters : mbukatov-usm1-node2.example.com])
2016-02-25T11:03:28.36+01:00 DEBUG lockmanager.go:83 ReleaseLock] Lock Released: %!(EXTRA uuid.UUID=fdff7bc2-1c27-4454-b775-f7ec48a52edb)
2016-02-25T11:03:28.36+01:00 DEBUG lockmanager.go:83 ReleaseLock] Lock Released: %!(EXTRA uuid.UUID=251ae1f8-c9bd-43bf-a78c-10aa5afe7ec9)
2016-02-25T11:03:28.36+01:00 DEBUG lockmanager.go:83 ReleaseLock] Lock Released: %!(EXTRA uuid.UUID=de342cd8-2535-4eb2-8583-ce03535c3fe7)
2016-02-25T11:03:28.36+01:00 DEBUG lockmanager.go:83 ReleaseLock] Lock Released: %!(EXTRA uuid.UUID=462b1a36-c987-4a2b-806d-2b199e632630)
2016-02-25T11:03:28.361+01:00 DEBUG lockmanager.go:83 ReleaseLock] Lock Released: %!(EXTRA uuid.UUID=870d6ca1-06ff-45a8-a9e5-aaa8f06bcbb0)
2016-02-25T11:03:28.361+01:00 DEBUG lockmanager.go:86 ReleaseLock] Currently Locked:%!(EXTRA map[uuid.UUID]*lock.LockInternal=map[])
~~~

(Terminal color escape sequences and the duplicated colorized/plain copies of each skyring log line were stripped from the excerpt above; the full, unmodified skyring.log is attached.)
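For cross-checking this kind of discrepancy outside the web UI, below is a minimal sketch of reading the backend task record over skyring's REST API and comparing it with what the task popup, tasks page and clusters page show. The base URL, port, endpoint paths, field names and credentials are assumptions for illustration only and are not confirmed against this build; the task id is the one seen in the log excerpt above.

~~~
#!/usr/bin/env python
# Minimal sketch: fetch the backend task record so it can be compared with the
# state shown by the various USM web UI components.
# NOTE: base URL, port, login endpoint, task endpoint and field names below are
# hypothetical and may differ from the skyring build under test.
import requests

BASE = "https://usm-server.example.com:10443/api/v1"   # hypothetical base URL
TASK_ID = "4b937f55-f6b0-4962-b0be-c07a119cd6fc"       # task id from the log above

session = requests.Session()
session.verify = False                                  # lab setup with a self-signed cert
session.post(BASE + "/auth/login",
             json={"username": "admin", "password": "admin"})  # hypothetical credentials

task = session.get(BASE + "/tasks/" + TASK_ID).json()
print("completed:", task.get("completed"), "status:", task.get("status"))
~~~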
Created attachment 1130492 [details] screenshot of progressbar and clusters page
Created attachment 1130493 [details] screenshot of progressbar and tasks page
Updating the BZ so that this issue is easier to reproduce with the latest builds.

ceph-0.94.5-9.el7cp.x86_64
ceph-ansible-1.0.1-1.20160307gitb354445.el7.noarch
ceph-common-0.94.5-9.el7cp.x86_64
redhat-ceph-installer-0.2.3-1.20160304gitb3e3c68.el7.noarch
rhscon-ceph-0.0.6-14.el7.x86_64
rhscon-core-0.0.8-14.el7.x86_64
rhscon-ui-0.0.23-1.el7.noarch

Steps to Reproduce
==================

1. Prepare node machines for the cluster. Make sure that all additional disks on the OSD machines have zero size; this will make the create cluster task fail later (one way to set this up is sketched after this comment).
2. Install skyring on the USM server machine.
3. Accept all nodes (the machines you prepared in step #1).
4. Start the "Create Cluster" wizard and create a cluster using the default config.
5. Wait until the Create Cluster task fails as expected.

Actual results
==============

The state of the cluster and of the create cluster task is not aligned across the various components of the USM web UI:

* The Tasks page shows the task as *Failed* (which is the correct description here).
* The Clusters page still shows the cluster in a 'Creating' state; the progress bar is unfinished and reads 'Creating'.
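A minimal sketch of one way to satisfy step #1 is below, assuming the node machines are libvirt/KVM guests (the report does not say how they were provisioned); the domain names, image paths and target device are hypothetical examples.

~~~
#!/usr/bin/env python
# Minimal sketch: give each OSD node an extra, zero-length disk so that OSD
# creation fails later during the "Create Cluster" task.
# NOTE: the libvirt domain names, image directory and target device are
# hypothetical examples; adjust them to the actual test environment.
import subprocess

osd_nodes = ["usm1-node3", "usm1-node4"]   # hypothetical libvirt domain names

for node in osd_nodes:
    image = "/var/lib/libvirt/images/{0}-zero-osd.img".format(node)
    open(image, "w").close()               # create a zero-length raw image
    subprocess.check_call(["virsh", "attach-disk", node, image, "vdb",
                           "--persistent"])  # attach it as an additional disk
~~~

With such zero-size disks attached before the nodes are accepted, the OSD preparation step has nothing usable to partition, which makes the Create Cluster task fail reproducibly.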
*** Bug 1341504 has been marked as a duplicate of this bug. ***
Fix patch: https://review.gerrithub.io/#/c/281235/
Tested with

ceph-ansible-1.0.5-32.el7scon.noarch
ceph-installer-1.0.14-1.el7scon.noarch
rhscon-ceph-0.0.40-1.el7scon.x86_64
rhscon-core-0.0.41-1.el7scon.x86_64
rhscon-core-selinux-0.0.41-1.el7scon.noarch
rhscon-ui-0.0.52-1.el7scon.noarch

and:

1) the cluster creation task passed even though no OSD was added
2) the cluster is in a failed state because there is no OSD

For 1) there is already a BZ; 2) is OK.

--> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2016:1754