Bug 1267337

Summary: Hosted engine fails to complete cleanly, can't see hosted engine VM, and can't import HE storage domain.
Product: [oVirt] ovirt-hosted-engine-setup
Reporter: Christopher Miersma <miersma>
Component: Plugins.Block
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED DUPLICATE
QA Contact: Ilanit Stein <istein>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 1.3.0
CC: bugs, didi, lveyde, miersma, rgolan, rmartins, sbonazzo, stirabos
Target Milestone: ---
Flags: rule-engine: planning_ack?, rule-engine: devel_ack?, rule-engine: testing_ack?
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: integration
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-14 15:53:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                              Flags
Output from hosted-engine setup          none
Logs from the engine VM in this case     none
Logs from the host                       none
Host logs                                none
Engine logs with the latest patches      none

Description Christopher Miersma 2015-09-29 17:10:06 UTC
Created attachment 1078435 [details]
Output from hosted-engine setup

Description of problem:


Version-Release number of selected component (if applicable): 3.6 release candidate


How reproducible:
This has happened to me twice, once in the beta and again in 3.6rc1

Steps to Reproduce:
1. Install hosted-engine-setup on CentOS 7.
2. Run hosted-engine-setup with Fibre Channel storage.
3. Build the hosted-engine VM using an ISO image stored locally on the host.

Actual results:
The hosted-engine setup ends with error messages including:
"The VDSM host was found in a failed state. Please check engine and bootstrap installation log." I am able to complete the installation by using the "reinstall" option on the host in the web interface. At that point, the interface gives the message: "The Hosted Engine Storage Domain doesn't no exist. It should be imported into the setup." When I try to import it, the interface cannot find the volume on which the storage domain resides. I am able to add additional LUNs as storage domains and create VMs; the HE Storage Domain LUN appears in those interfaces, but it is greyed out.

Expected results:
The hosted-engine setup should complete cleanly, and either automatically import the storage domain on which the HE VM resides or allow it to be imported via the web interface. In addition, the engine VM should appear in the interface.

Additional info:
This could be related to these bugs, but I'm not sure:
https://bugzilla.redhat.com/show_bug.cgi?id=1254748
https://bugzilla.redhat.com/show_bug.cgi?id=1221956
https://bugzilla.redhat.com/show_bug.cgi?id=1261996

Comment 1 Christopher Miersma 2015-09-29 17:10:43 UTC
Created attachment 1078436 [details]
Logs from the engine VM in this case

Comment 2 Christopher Miersma 2015-09-29 17:11:23 UTC
Created attachment 1078437 [details]
Logs from the host

Comment 3 Sandro Bonazzola 2015-10-07 15:07:29 UTC
2015-09-28 14:49:53 DEBUG otopi.plugins.ovirt_hosted_engine_setup.storage.heconf heconflib.create_heconfimage:246 stderr: dd: failed to open ‘/rhev/data-center/mnt/blockSD/da105b33-9ef8-4cbd-bb51-916c223ed218/images/551a55b1-cae1-419c-b8bb-944fad65a594/ba9d6d41-407b-4014-ae4d-0fd407c0f451’: Permission denied

2015-09-28 14:49:53 DEBUG otopi.context context._executeMethod:156 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/heconf.py", line 149, in _closeup_create_tar
    dest
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/heconflib.py", line 249, in create_heconfimage
    raise RuntimeError('Unable to write HEConfImage')
RuntimeError: Unable to write HEConfImage
2015-09-28 14:49:53 ERROR otopi.context context._executeMethod:165 Failed to execute stage 'Closing up': Unable to write HEConfImage
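
The dd failure above names the exact volume path that could not be opened. As a rough sketch (not part of hosted-engine-setup; the helper names parse_dd_error and describe are made up for illustration), the path can be pulled out of the log line and its ownership and mode inspected on the host. Note that os.stat follows symlinks, so for a block storage domain it would typically report on the underlying device rather than the link itself.

```python
import os
import re
import stat

# Matches GNU dd's "failed to open" message, with curly or straight quotes.
DD_ERROR = re.compile(
    r"dd: failed to open [‘'](?P<path>.+?)[’']: (?P<reason>.+)$"
)


def parse_dd_error(line):
    """Return (path, reason) from a dd 'failed to open' log line, or None."""
    match = DD_ERROR.search(line.strip())
    if match is None:
        return None
    return match.group("path"), match.group("reason")


def describe(path):
    """Summarise owner, group and mode of path (symlinks are followed)."""
    try:
        info = os.stat(path)
    except OSError as exc:
        return "cannot stat %s: %s" % (path, exc)
    return "%s uid=%d gid=%d mode=%s" % (
        path, info.st_uid, info.st_gid, stat.filemode(info.st_mode)
    )
```

Passing the log line above to parse_dd_error and the resulting path to describe, run as root on the host, shows which user and group own the volume, which can then be compared with the user that ran dd.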

Can you please provide the output of "ausearch -m avc -ts 28/09/2015"?

Can you retry with SELinux disabled?

Can you provide the output of "rpm -qv selinux-policy-targeted"?
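
The requested information could be gathered in one pass with something like the sketch below. It is only an illustration: the function names diagnostic_commands and run_all are hypothetical, and getenforce is an addition that was not asked for above (it only reports the current SELinux mode). The ausearch and rpm invocations are the ones quoted in the questions; ausearch normally needs root, and the date format it accepts depends on the host locale.

```python
import shutil
import subprocess


def diagnostic_commands(since):
    """Build the commands to run; `since` is the ausearch start date."""
    return [
        ["ausearch", "-m", "avc", "-ts", since],
        ["getenforce"],  # not requested above; shows the current SELinux mode
        ["rpm", "-qv", "selinux-policy-targeted"],
    ]


def run_all(commands):
    """Run each command whose binary exists; return {command line: output}."""
    results = {}
    for argv in commands:
        key = " ".join(argv)
        if shutil.which(argv[0]) is None:
            results[key] = "(%s not found on this host)" % argv[0]
            continue
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        results[key] = proc.stdout
    return results
```

Calling run_all(diagnostic_commands("28/09/2015")) returns a dictionary keyed by command line, which can be printed or saved and attached to the bug.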

Comment 4 Christopher Miersma 2015-10-08 13:45:10 UTC
(In reply to Sandro Bonazzola from comment #3)
> Can you please provide the output of "ausearch -m avc -ts 28/09/2015" ?
> 
> Can you retry disabling selinux ?
> 
> Can you provide the output of: "rpm -qv selinux-policy-targeted" ?

Unfortunately, I no longer have that particular install up and running, as I have rebuilt on the same hardware. However, I can guarantee that I was running SELinux in Permissive mode on both the host and the engine VM. In addition, the new setup has the same issue, but I can't find the error message quoted above in its logs.

Comment 5 Christopher Miersma 2015-10-09 16:42:51 UTC
Created attachment 1081391 [details]
Host logs

I've retried a clean install with the latest build from the snapshots repo. Everything else is as clean as possible, but the hosted-engine domain still does not appear for import. SELinux is in Permissive mode throughout. If you're searching through these logs, the UUID for the hosted engine storage domain is 78501422-e863-47e2-986d-39e11577dc50.
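
For anyone going through the attached logs, a small helper along these lines can list every line that mentions the storage domain UUID. This is a minimal sketch: find_uuid is a made-up name, the directory passed in is wherever the attachment was unpacked, and compressed files (.gz, .xz) would need to be decompressed first since they are read as plain text here.

```python
import os


def find_uuid(root, uuid):
    """Yield (path, line number, line) for each line under root containing uuid."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps the scan going on odd byte sequences.
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if uuid in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable file (permissions, broken link): skip it.
                continue
```

For example, iterating over find_uuid(".", "78501422-e863-47e2-986d-39e11577dc50") in the unpacked log directory prints the file, line number and text of each match.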

Comment 6 Christopher Miersma 2015-10-09 16:44:36 UTC
Created attachment 1081392 [details]
Engine logs with the latest patches

This is from a new clean installation.

Comment 7 Simone Tiraboschi 2015-10-14 15:53:24 UTC

*** This bug has been marked as a duplicate of bug 1269768 ***