Bug 1171452
Summary: Host added to the HE environment after upgrade from 3.4 to 3.5 failed to connect to storage

| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Artyom <alukiano> |
|---|---|---|---|
| Component: | ovirt-hosted-engine-ha | Assignee: | Doron Fediuck <dfediuck> |
| Status: | CLOSED ERRATA | QA Contact: | Artyom <alukiano> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.5.0 | CC: | alukiano, amureini, cshao, dfediuck, ebenahar, ecohen, gklein, huiwa, iheim, lsurette, mavital, sbonazzo, sherold, stirabos, ycui |
| Target Milestone: | --- | | |
| Target Release: | 3.5.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | sla | | |
| Fixed In Version: | ovirt-hosted-engine-ha-1.2.4-5.el6ev | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-02-11 21:09:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1157243, 1160808, 1164308, 1164311, 1173951 | | |
| Attachments: | | | |
(In reply to Artyom from comment #0)
> Created attachment 965516 [details]
> Steps to Reproduce:
> 2. Upgrade HE environment to 3.5(you can find upgrade process in ovirt wiki)

Seriously? You can't even provide a link to the page?

The underlying error comes from VDSM:

Thread-54::INFO::2014-12-04 18:23:05,549::logUtils::44::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='d67ca20f-a4be-4b3e-9a96-78474491111f', spUUID='00000000-0000-0000-0000-000000000000', imgUUID='None', leafUUID='None')

Since imgUUID and leafUUID are None, the image obviously can't be prepared. We need insight from the hosted-engine engineers to check why this is sent to VDSM.

Simone - can you take a look and either explain why this input is expected, or re-assign to someone from HE's dev team if it isn't? Thanks!

Artyom, can you please also attach the hosted-engine-setup log files from /var/log/ovirt-hosted-engine-setup/ on the failing host?

Created attachment 966218 [details]
ovirt-hosted-setup
Artyom - thanks. Returning needinfo on Simone - I still need your input on comment 3. Thanks!

prepareImage is never called directly by hosted-engine-setup, but it is called at runtime by hosted-engine-ha. Adding Jiri Moskovcak into the loop.

This is on the hosted-engine-ha side; we should check whether the imgUUID and volUUID provided in the config are valid UUIDs, and use the old NFS handling code if they are not.

Can you please apply the suggested patch locally and verify it? It would help me a lot, because I don't have an environment to test it right now.

I can, but it will take time because I need to configure the environment from zero. So by this evening or tomorrow I will give you the result.

(In reply to Artyom from comment #10)
> I can, but it will take time because I need to configure environment from
> zero.
> So today by evening or tomorrow I will give you result.

Any updates?

Sorry for the late answer; it is still the same, even after I applied your patch to the host that I want to add and restarted the ovirt-ha-agent service.

Broker.log:

Thread-130::ERROR::2014-12-14 11:35:47,401::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'set-storage-domain VdsmBackend hosted-engine.lockspace=7B22696D6167655F75756964223A20224E6F6E65222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A20224E6F6E65227D sp_uuid=576a5759-1684-405d-bae8-5e84cc3aee80 dom_type=nfs3 hosted-engine.metadata=7B22696D6167655F75756964223A20224E6F6E65222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A20224E6F6E65227D sd_uuid=5e810048-26a3-45d9-94d9-86c50bfab68d'

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
    data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
    .set_storage_domain(client, sd_type, **options)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 65, in set_storage_domain
    self._backends[client].connect()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 370, in connect
    raise RuntimeError(response["status"]["message"])
RuntimeError: Volume does not exist: ('None',)

*** Bug 1175793 has been marked as a duplicate of this bug. ***

Created attachment 971690 [details]
setup log
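For what it's worth, the hosted-engine.lockspace and hosted-engine.metadata values in the broker request above are hex-encoded JSON. Decoding one of them (a quick illustrative sketch, written for Python 3 rather than the Python 2.6 shown in the tracebacks) shows that the literal string "None" is being passed as both UUIDs, which is exactly what the backend later chokes on:

```python
import json

# Hex blob copied verbatim from the broker.log request above.
blob = ("7B22696D6167655F75756964223A20224E6F6E65222C2022"
        "70617468223A206E756C6C2C2022766F6C756D655F757569"
        "64223A20224E6F6E65227D")

# Decode hex -> ASCII, then parse the resulting JSON document.
decoded = bytes.fromhex(blob).decode("ascii")
params = json.loads(decoded)

print(decoded)
# The UUIDs arrive as the *string* "None" (not JSON null), which is
# why connect() later fails with: Volume does not exist: ('None',)
```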
Checked on ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
This patch also prevents me from providing a partial answer file to the deployment, for example:
12:58:23 [environment:default]
12:58:23 OVEHOSTED_VDSM/consoleType=str:vnc
12:58:23 OVEHOSTED_VDSM/cpu=str:model_Conroe
12:58:23 OVEHOSTED_NOTIF/smtpPort=str:25
12:58:23 OVEHOSTED_NOTIF/smtpServer=str:localhost
12:58:23 OVEHOSTED_NOTIF/destEmail=str:root@localhost
12:58:23 OVEHOSTED_NOTIF/sourceEmail=str:root@localhost
12:58:23 OVEHOSTED_VM/vmVCpus=str:2
12:58:23 OVEHOSTED_VM/vmMemSizeMB=str:4096
12:58:23 OVEHOSTED_VM/vmMACAddr=str:00:16:3E:14:E7:A2
12:58:23 OVEHOSTED_VM/vmBoot=str:pxe
12:58:23 OVEHOSTED_VM/emulatedMachine=str:rhel6.5.0
12:58:23 OVEHOSTED_VM/ovfArchive=none:None
12:58:23 OVEHOSTED_VM/vmCDRom=none:None
12:58:23 OVEHOSTED_ENGINE/appHostName=str:hosted_engine_1
12:58:23 OVEHOSTED_ENGINE/adminPassword=str:123456
12:58:23 OVEHOSTED_STORAGE/imgSizeGB=str:25
12:58:23 OVEHOSTED_STORAGE/storageDomainName=str:hosted_storage
12:58:23 OVEHOSTED_STORAGE/storageDatacenterName=str:hosted_datacenter
12:58:23 OVEHOSTED_CORE/confirmSettings=bool:True
12:58:23 OVEHOSTED_CORE/screenProceed=bool:True
12:58:23 OVEHOSTED_CORE/deployProceed=bool:True
12:58:23 OVEHOSTED_NETWORK/bridgeName=str:rhevm
12:58:23 OVEHOSTED_NETWORK/bridgeIf=none:None
12:58:23 OVEHOSTED_NETWORK/firewallManager=str:iptables
12:58:23 OVEHOSTED_NETWORK/fqdn=str:alukiano-he-1.qa.lab.tlv.redhat.com
The deployment fails with the message:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/sanlock/lockspace.py", line 150, in _misc
    **activate_devices
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 228, in __init__
    activate_devices.items()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 227, in <lambda>
    (service, self.Device.device_from_str(device)),
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 193, in device_from_str
    dev = cls(None, None, None)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 152, in __init__
    "image_uuid or volume_uuid is missing or None"
ValueError: image_uuid or volume_uuid is missing or None
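The fix suggested in the comments amounts to validating the configured UUIDs before constructing the storage backend, and falling back to the old NFS handling code when they are invalid. A minimal sketch of such a check (a hypothetical helper for illustration, not the actual ovirt-hosted-engine-ha code):

```python
import uuid


def is_valid_uuid(value):
    """Return True only for a real UUID string, rejecting both None
    and the literal string "None" left behind by a 3.4-era config."""
    try:
        uuid.UUID(str(value))
        return True
    except ValueError:
        return False


# A 3.4-era hosted-engine.conf leaves these fields unset, so the agent
# should take the old NFS code path instead of raising
# "image_uuid or volume_uuid is missing or None".
print(is_valid_uuid("None"))                                  # False
print(is_valid_uuid("d67ca20f-a4be-4b3e-9a96-78474491111f"))  # True
```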
It also prevents regular installation on NFS storage, with the same error. The same happens on block storage, as reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1175793

It's fixed in ovirt-hosted-engine-ha-1.2.4-5.el6ev.noarch; can you please retest?

Verified on ovirt-hosted-engine-ha-1.2.4-5.el6ev.noarch. Both hosts run fine:

--== Host 1 status ==--

Status up-to-date              : True
Hostname                       : 10.35.64.85
Host ID                        : 1
Engine status                  : {"health": "good", "vm": "up", "detail": "up"}
Score                          : 2400
Local maintenance              : False
Host timestamp                 : 7792
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=7792 (Sun Jan  4 16:58:49 2015)
    host-id=1
    score=2400
    maintenance=False
    state=EngineUp

--== Host 2 status ==--

Status up-to-date              : True
Hostname                       : cyan-vdsf.qa.lab.tlv.redhat.com
Host ID                        : 2
Engine status                  : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                          : 2400
Local maintenance              : False
Host timestamp                 : 585
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=585 (Sun Jan  4 16:58:53 2015)
    host-id=2
    score=2400
    maintenance=False
    state=EngineDown

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0194.html
Created attachment 965516 [details]
logs and config file

Description of problem:
I have an HE environment with two hosts after the upgrade process from 3.4 to 3.5. Now that I have a 3.5 HE environment, I want to add an additional host. The setup process passes, but afterwards the broker on the new host can't connect to storage.

Version-Release number of selected component (if applicable):
Upgrade from ovirt-hosted-engine-ha-1.1.6-3.el6ev.noarch to ovirt-hosted-engine-ha-1.2.4-2.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set up a 3.4 HE environment with two hosts
2. Upgrade the HE environment to 3.5 (you can find the upgrade process in the oVirt wiki)
3. Try to add a new host and check its status

Actual results:
Deployment of the host passes, but afterwards the broker cannot connect to storage.

Expected results:
The broker connects to storage and the host retrieves all information about the HE environment.

Additional info:
I also added hosted-engine.conf to the archive file.