Bug 1171452

Summary: Host added to an HE environment after a 3.4->3.5 upgrade fails to connect to storage
Product: Red Hat Enterprise Virtualization Manager
Reporter: Artyom <alukiano>
Component: ovirt-hosted-engine-ha
Assignee: Doron Fediuck <dfediuck>
Status: CLOSED ERRATA
QA Contact: Artyom <alukiano>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: alukiano, amureini, cshao, dfediuck, ebenahar, ecohen, gklein, huiwa, iheim, lsurette, mavital, sbonazzo, sherold, stirabos, ycui
Target Milestone: ---
Target Release: 3.5.0
Hardware: x86_64
OS: Linux
Whiteboard: sla
Fixed In Version: ovirt-hosted-engine-ha-1.2.4-5.el6ev
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-02-11 21:09:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1157243, 1160808, 1164308, 1164311, 1173951    
Attachments (description / flags):
logs and config file (none)
ovirt-hosted-setup (none)
setup log (none)

Description Artyom 2014-12-07 07:21:34 UTC
Created attachment 965516 [details]
logs and config file

Description of problem:
I have an HE environment with two hosts after an upgrade from 3.4 to 3.5. Now that I have a 3.5 HE environment, I want to add an additional host. The setup process passes, but afterwards the broker on the new host cannot connect to the storage.

Version-Release number of selected component (if applicable):
upgrade from ovirt-hosted-engine-ha-1.1.6-3.el6ev.noarch to ovirt-hosted-engine-ha-1.2.4-2.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Setup 3.4 HE environment with two hosts
2. Upgrade the HE environment to 3.5 (the upgrade process is documented in the oVirt wiki)
3. Try to add new host and check status

Actual results:
Deployment of the host passes, but afterwards the broker cannot connect to the storage.

Expected results:
The broker connects to the storage and the host retrieves all information about the HE environment.

Additional info:
I also added hosted-engine.conf to the archive file.

Comment 1 Allon Mureinik 2014-12-08 11:22:36 UTC
(In reply to Artyom from comment #0)
> Created attachment 965516 [details]
> Steps to Reproduce:
> 2. Upgrade HE environment to 3.5(you can find upgrade process in ovirt wiki)
Seriously?
You can't even provide a link to the page?

Comment 2 Artyom 2014-12-08 11:34:11 UTC
http://www.ovirt.org/Hosted_Engine_Howto

Comment 3 Allon Mureinik 2014-12-08 12:24:19 UTC
The underlying error comes from VDSM:

Thread-54::INFO::2014-12-04 18:23:05,549::logUtils::44::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='d67ca20f-a4be-4b3e-9a96-78474491111f', spUUID='00000000-0000-0000-0000-000000000000', imgUUID='None', leafUUID='None')

Since imgUUID and leafUUID are 'None', the image obviously can't be prepared.
We need insight from the hosted-engine engineers to check why this is sent to VDSM.
Simone - can you take a look and either explain why this input is expected or re-assign to someone from HE's dev team if it isn't?

Thanks!
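For context (an illustration, not code from the bug itself): the suspicious values in the VDSM call above are the four-character string 'None', not a real null, which is the classic symptom of a missing Python value being pushed through str() when a config file is written. A minimal sketch of that failure mode, with hypothetical variable names:

```python
# Hypothetical illustration: a value that is absent after the upgrade
# gets stringified while writing the config, producing the string "None".
image_uuid = None  # missing after the 3.4 -> 3.5 upgrade

line = "imgUUID=%s" % image_uuid   # str(None) == "None"
print(line)                        # imgUUID=None

# Reading the file back yields the truthy, non-empty string "None",
# so a naive "is it set?" check passes and "None" reaches VDSM as a UUID.
value = line.split("=", 1)[1]
assert value == "None"
assert bool(value)
```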

Comment 4 Simone Tiraboschi 2014-12-09 10:20:51 UTC
Artyom, can you please attach also hosted-engine-setup log files from
/var/log/ovirt-hosted-engine-setup/ on the failing host?

Comment 5 Artyom 2014-12-09 12:48:47 UTC
Created attachment 966218 [details]
ovirt-hosted-setup

Comment 6 Allon Mureinik 2014-12-09 14:47:45 UTC
Artyom - thanks.

Returning needinfo on Simone - I still need your input on comment 3.
Thanks!

Comment 7 Simone Tiraboschi 2014-12-09 15:26:47 UTC
prepareImage is never directly called by hosted-engine-setup but is called at runtime by hosted-engine-ha.
Adding Jiri Moskovcak into the loop.

Comment 8 Jiri Moskovcak 2014-12-10 08:53:00 UTC
This is on the hosted-engine-ha side. We should check whether the imgUUID and volUUID provided in the config are valid UUIDs, and fall back to the old NFS handling code if they are not.
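A minimal sketch of the validation step described above, assuming a check built on the standard-library uuid module (the function name is hypothetical, not from the actual patch):

```python
import uuid

def is_valid_uuid(value):
    """Return True only for a well-formed UUID string.

    The string "None" (what the upgraded config contained) and a real
    None both fail here, which would trigger the fallback to the old
    NFS handling code.
    """
    try:
        uuid.UUID(str(value))
        return True
    except ValueError:
        return False

assert not is_valid_uuid("None")
assert not is_valid_uuid(None)
assert is_valid_uuid("d67ca20f-a4be-4b3e-9a96-78474491111f")
```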

Comment 9 Jiri Moskovcak 2014-12-10 09:54:06 UTC
Can you please apply the suggested patch locally and verify it? It would help me a lot, since I don't have an environment to test it right now.

Comment 10 Artyom 2014-12-10 10:00:21 UTC
I can, but it will take time because I need to configure the environment from scratch.
I will give you the result by this evening or tomorrow.

Comment 11 Jiri Moskovcak 2014-12-11 13:57:21 UTC
(In reply to Artyom from comment #10)
> I can, but it will take time because I need to configure environment from
> zero.
> So today by evening or tomorrow I will give you result.

Any updates?

Comment 12 Artyom 2014-12-14 11:40:02 UTC
Sorry for the late answer. The problem is still the same, even after I applied your patch to the host that I want to add and restarted the ovirt-ha-agent service.
Broker.log:
Thread-130::ERROR::2014-12-14 11:35:47,401::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'set-storage-domain VdsmBackend hosted-engine.lockspace=7B22696D6167655F75756964223A20224E6F6E65222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A20224E6F6E65227D sp_uuid=576a5759-1684-405d-bae8-5e84cc3aee80 dom_type=nfs3 hosted-engine.metadata=7B22696D6167655F75756964223A20224E6F6E65222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A20224E6F6E65227D sd_uuid=5e810048-26a3-45d9-94d9-86c50bfab68d'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
    data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
    .set_storage_domain(client, sd_type, **options)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 65, in set_storage_domain
    self._backends[client].connect()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 370, in connect
    raise RuntimeError(response["status"]["message"])
RuntimeError: Volume does not exist: ('None',)

Comment 15 Sandro Bonazzola 2014-12-19 09:50:40 UTC
*** Bug 1175793 has been marked as a duplicate of this bug. ***

Comment 16 Artyom 2014-12-21 11:29:04 UTC
Created attachment 971690 [details]
setup log

Checked on ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch.
This patch also prevents me from giving a partial answer file to the deployment, like:
12:58:23 [environment:default]
12:58:23 OVEHOSTED_VDSM/consoleType=str:vnc
12:58:23 OVEHOSTED_VDSM/cpu=str:model_Conroe
12:58:23 OVEHOSTED_NOTIF/smtpPort=str:25
12:58:23 OVEHOSTED_NOTIF/smtpServer=str:localhost
12:58:23 OVEHOSTED_NOTIF/destEmail=str:root@localhost
12:58:23 OVEHOSTED_NOTIF/sourceEmail=str:root@localhost
12:58:23 OVEHOSTED_VM/vmVCpus=str:2
12:58:23 OVEHOSTED_VM/vmMemSizeMB=str:4096
12:58:23 OVEHOSTED_VM/vmMACAddr=str:00:16:3E:14:E7:A2
12:58:23 OVEHOSTED_VM/vmBoot=str:pxe
12:58:23 OVEHOSTED_VM/emulatedMachine=str:rhel6.5.0
12:58:23 OVEHOSTED_VM/ovfArchive=none:None
12:58:23 OVEHOSTED_VM/vmCDRom=none:None
12:58:23 OVEHOSTED_ENGINE/appHostName=str:hosted_engine_1
12:58:23 OVEHOSTED_ENGINE/adminPassword=str:123456
12:58:23 OVEHOSTED_STORAGE/imgSizeGB=str:25
12:58:23 OVEHOSTED_STORAGE/storageDomainName=str:hosted_storage
12:58:23 OVEHOSTED_STORAGE/storageDatacenterName=str:hosted_datacenter
12:58:23 OVEHOSTED_CORE/confirmSettings=bool:True
12:58:23 OVEHOSTED_CORE/screenProceed=bool:True
12:58:23 OVEHOSTED_CORE/deployProceed=bool:True
12:58:23 OVEHOSTED_NETWORK/bridgeName=str:rhevm
12:58:23 OVEHOSTED_NETWORK/bridgeIf=none:None
12:58:23 OVEHOSTED_NETWORK/firewallManager=str:iptables
12:58:23 OVEHOSTED_NETWORK/fqdn=str:alukiano-he-1.qa.lab.tlv.redhat.com

The deployment fails with the message:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/sanlock/lockspace.py", line 150, in _misc
    **activate_devices
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 228, in __init__
    activate_devices.items()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 227, in <lambda>
    (service, self.Device.device_from_str(device)),
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 193, in device_from_str
    dev = cls(None, None, None)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 152, in __init__
    "image_uuid or volume_uuid is missing or None"
ValueError: image_uuid or volume_uuid is missing or None

Comment 17 Artyom 2014-12-21 14:33:14 UTC
It also prevents a regular installation on NFS storage, with the same error.

Comment 18 Elad 2014-12-21 15:57:08 UTC
Happens also on block storage, as reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=1175793

Comment 19 Jiri Moskovcak 2015-01-02 13:34:36 UTC
It's fixed in ovirt-hosted-engine-ha-1.2.4-5.el6ev.noarch; can you please retest?

Comment 20 Artyom 2015-01-04 14:59:56 UTC
Verified on ovirt-hosted-engine-ha-1.2.4-5.el6ev.noarch.
Both hosts run fine:
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : 10.35.64.85
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 7792
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7792 (Sun Jan  4 16:58:49 2015)
        host-id=1
        score=2400
        maintenance=False
        state=EngineUp


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : cyan-vdsf.qa.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 585
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=585 (Sun Jan  4 16:58:53 2015)
        host-id=2
        score=2400
        maintenance=False
        state=EngineDown

Comment 23 errata-xmlrpc 2015-02-11 21:09:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0194.html