Bug 1517881 - Deployment of SHE stuck on the appliance.
Summary: Deployment of SHE stuck on the appliance.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: 2.2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.2.0
Target Release: 2.2.0
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Duplicates: 1518850 1540123
Depends On:
Blocks:
Reported: 2017-11-27 16:13 UTC by Nikolai Sednev
Modified: 2018-01-30 10:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Due to an SELinux policy issue, engine-setup failed when executed by cloud-init. As a workaround, SELinux is temporarily set to permissive mode at engine-setup time.
Clone Of:
Environment:
Last Closed: 2017-12-20 11:10:16 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+
rule-engine: planning_ack+
sbonazzo: devel_ack+
mavital: testing_ack+


Attachments
logs from alma03 (9.31 MB, application/x-xz)
2017-11-27 16:13 UTC, Nikolai Sednev


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1540123 | 0 | unspecified | CLOSED | [HE] failed to deploy hosted engine with --noansible | 2021-02-22 00:41:40 UTC
oVirt gerrit | 84740 | 0 | master | MERGED | selinux: set permissive mode at engine-setup time | 2020-03-21 20:54:03 UTC

Internal Links: 1540123

Description Nikolai Sednev 2017-11-27 16:13:43 UTC
Created attachment 1359541
logs from alma03

Description of problem:
Deployment of SHE gets stuck at "Stage: Misc configuration" and then, after a long timeout, fails with:
[ ERROR ] Engine setup got stuck on the appliance
[ ERROR ] Failed to execute stage 'Closing up': Engine setup is stalled on the appliance since 1800 seconds ago.
         Please check its log on the appliance.


Looks like this is the issue:
2017-11-27 17:36:28,256+0200 INFO  (jsonrpc/7) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'24e0ccff-f029-4b23-930b-ecb85ab11924',) from=::1,43820, task_id=be70e5cd-4dc8-4166-8373-07581905b1d1 (api:50)
2017-11-27 17:36:28,257+0200 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='be70e5cd-4dc8-4166-8373-07581905b1d1') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in prepareImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3157, in prepareImage
    raise se.VolumeDoesNotExist(leafUUID)
VolumeDoesNotExist: Volume does not exist: (u'24e0ccff-f029-4b23-930b-ecb85ab11924',)
2017-11-27 17:36:28,257+0200 INFO  (jsonrpc/7) [storage.TaskManager.Task] (Task='be70e5cd-4dc8-4166-8373-07581905b1d1') aborting: Task is aborted: "Volume does not exist: (u'24e0ccff-f029-4b23-930b-ecb85ab11924',)" - code 201 (task:1181)
2017-11-27 17:36:28,258+0200 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH prepareImage error=Volume does not exist: (u'24e0ccff-f029-4b23-930b-ecb85ab11924',) (dispatcher:82)
2017-11-27 17:36:28,258+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Image.prepare failed (error 201) in 0.01 seconds (__init__:573)
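
For anyone triaging similar logs, the excerpt above can be scanned for "Volume does not exist" entries with a short script. This is only a rough sketch: the regular expressions are inferred from the lines quoted in this report, not from any documented vdsm log format, and `missing_volumes` is a hypothetical helper name.

```python
import re

# Sample entries copied from the excerpt in this report, each on one line.
SAMPLE = """\
2017-11-27 17:36:28,256+0200 INFO  (jsonrpc/7) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'24e0ccff-f029-4b23-930b-ecb85ab11924',) from=::1,43820, task_id=be70e5cd-4dc8-4166-8373-07581905b1d1 (api:50)
2017-11-27 17:36:28,258+0200 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH prepareImage error=Volume does not exist: (u'24e0ccff-f029-4b23-930b-ecb85ab11924',) (dispatcher:82)
2017-11-27 17:36:28,258+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Image.prepare failed (error 201) in 0.01 seconds (__init__:573)
"""

# Assumed layout: "<timestamp> <LEVEL> (<thread>) [<logger>] <message>".
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"(?P<level>[A-Z]+)\s+\((?P<thread>[^)]+)\) "
    r"\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)
# Matches the Python 2 tuple repr seen in the message, e.g. (u'<uuid>',).
MISSING_RE = re.compile(r"Volume does not exist: \(u'(?P<uuid>[0-9a-f-]{36})',\)")


def missing_volumes(text):
    """Return (timestamp, level, logger, volume UUID) for each matching entry."""
    found = []
    for line in text.splitlines():
        entry = LINE_RE.match(line)
        if entry is None:
            continue  # traceback lines and other continuation lines
        volume = MISSING_RE.search(entry.group("msg"))
        if volume:
            found.append((entry.group("ts"), entry.group("level"),
                          entry.group("logger"), volume.group("uuid")))
    return found


if __name__ == "__main__":
    for ts, level, logger, uuid in missing_volumes(SAMPLE):
        print(ts, level, logger, uuid)
```

Such a scan only shows where the message occurs. Whether an occurrence is a real failure still has to be judged from context.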

Host alma03 has gluster volume mounted as follows:
gluster01.scl.lab.tlv.redhat.com:/nsednev_he_1 on /rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_nsednev__he__1 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.0-0.0.master.20171124110627.gitc5547b6.el7.centos.noarch
ovirt-hosted-engine-ha-2.2.0-0.0.master.20171122155227.20171122155225.gitbc3ec09.el7.centos.noarch
ovirt-engine-appliance-4.2-20171126.1.el7.centos.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy SHE over a Gluster volume.

Actual results:
Deployment gets stuck.

Expected results:
Deployment should be successful. 

Additional info:
Logs from the host are attached.

Comment 1 Nikolai Sednev 2017-11-27 16:18:00 UTC
Deployment details available from here: http://pastebin.test.redhat.com/535404

Comment 2 Simone Tiraboschi 2017-11-27 16:19:10 UTC
(In reply to Nikolai Sednev from comment #0)
> Description of problem:
> Deployment of SHE stuck on "Stage: Misc configuration" and then after long
> timeout deployment fails with:
> [ ERROR ] Engine setup got stuck on the appliance

We need the engine-setup logs to check what happened there.

> [ ERROR ] Failed to execute stage 'Closing up': Engine setup is stalled on
> the appliance since 1800 seconds ago.
>          Please check its log on the appliance.
> 
> 
> Looks like this is the issue:

> VolumeDoesNotExist: Volume does not exist:
> (u'24e0ccff-f029-4b23-930b-ecb85ab11924',)

This is a false positive: hosted-engine-setup just checks whether the volume exists before creating it.
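
To illustrate why such a check leaves an error in the log even on a healthy run, here is a minimal sketch of a check-before-create pattern. It is not the hosted-engine-setup or vdsm code: `FakeStorage`, `ensure_volume`, and the local `VolumeDoesNotExist` class are hypothetical stand-ins.

```python
class VolumeDoesNotExist(Exception):
    """Stand-in for the storage error seen in the vdsm log."""


class FakeStorage(object):
    """Hypothetical in-memory storage that logs failed lookups as errors."""

    def __init__(self):
        self.volumes = set()
        self.log = []

    def prepare_image(self, vol_id):
        if vol_id not in self.volumes:
            # The storage layer cannot know the caller expects this failure,
            # so it logs it as an error.
            self.log.append("ERROR Volume does not exist: %s" % vol_id)
            raise VolumeDoesNotExist(vol_id)
        return vol_id

    def create_volume(self, vol_id):
        self.volumes.add(vol_id)


def ensure_volume(storage, vol_id):
    """Create the volume only if the existence probe fails.

    Returns True if the volume was created, False if it already existed.
    """
    try:
        storage.prepare_image(vol_id)
        return False
    except VolumeDoesNotExist:
        storage.create_volume(vol_id)
        return True


if __name__ == "__main__":
    storage = FakeStorage()
    created = ensure_volume(storage, "24e0ccff-f029-4b23-930b-ecb85ab11924")
    print(created, storage.log)
```

On a first deployment the probe is expected to fail, so the logged error by itself does not indicate a problem.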

Comment 3 Nikolai Sednev 2017-11-27 16:22:28 UTC
I tried to get into the engine; it was alive, but I could not log in to it.

Comment 4 Simone Tiraboschi 2017-11-27 16:48:55 UTC
(In reply to Simone Tiraboschi from comment #2)
> We need engine-setup logs to check what happened there

It still seems to be SELinux related.
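
The Doc Text of this bug describes the fix as temporarily setting permissive mode, and the linked gerrit change is titled "selinux: set permissive mode at engine-setup time". The content of that patch is not shown here, so the following is only a sketch of the general idea using the standard `getenforce` and `setenforce` commands. `temporarily_permissive` is a hypothetical helper, and its `run` parameter exists only so the logic can be exercised without touching a real system.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def temporarily_permissive(run=subprocess.check_output):
    """Switch SELinux to permissive for the duration of the block.

    Assumption: `run` takes an argv list and returns the command output as
    bytes, like subprocess.check_output. Requires root on a real system.
    """
    previous = run(["getenforce"]).decode().strip()
    if previous == "Enforcing":
        run(["setenforce", "0"])  # 0 = permissive
    try:
        yield previous
    finally:
        # Restore enforcing mode only if it was active before.
        if previous == "Enforcing":
            run(["setenforce", "1"])  # 1 = enforcing


# Intended use (not executed here): wrap the engine-setup invocation, e.g.
#   with temporarily_permissive():
#       subprocess.check_call(["engine-setup"])
```

Restoring the mode in a `finally` clause means the host returns to enforcing mode even if the wrapped command fails.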

Comment 5 Nikolai Sednev 2017-11-27 16:49:12 UTC
The same failure also happens when deploying over NFS, so this bug is not specific to a storage type.

Comment 6 Red Hat Bugzilla Rules Engine 2017-11-29 07:27:08 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 7 Simone Tiraboschi 2017-11-29 16:24:09 UTC
*** Bug 1518850 has been marked as a duplicate of this bug. ***

Comment 8 Simone Tiraboschi 2017-12-06 13:07:52 UTC
*** Bug 1522641 has been marked as a duplicate of this bug. ***

Comment 9 Nikolai Sednev 2017-12-12 14:35:43 UTC
Deployed on RHEL7.4 hosts, using:
ovirt-hosted-engine-ha-2.2.0-0.0.master.20171128125909.20171128125907.gitfa5daa6.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.0-0.0.master.20171129192644.git440040c.el7.centos.noarch
ovirt-engine-appliance-4.2-20171129.1.el7.centos.noarch

Over Gluster - passed; 
Over NFS - passed; 
Over iSCSI - passed; 

Moving to verified.

Comment 10 Sandro Bonazzola 2017-12-20 11:10:16 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 11 Simone Tiraboschi 2018-01-30 10:55:10 UTC
*** Bug 1540123 has been marked as a duplicate of this bug. ***

