Created attachment 839003
HE log & vdsm

Description of problem:
vdsm asks for the hosted-engine configuration before HE has been deployed. See Additional info for the vdsm.log excerpt.

...
IOError: [Errno 2] No such file or directory: '/etc/ovirt-hosted-engine/hosted-engine.conf'
...

Version-Release number of selected component (if applicable):
is27

How reproducible:
100%

Steps to Reproduce:
1. have HE installed, then try to disassemble it
2. yum erase ovirt-host\* and remove /etc/ovirt*
3. try to install HE again

Actual results:
HE installation never ends; it stops on this line:
...
[ INFO ] Start monitoring domain

Expected results:

Additional info:
...
Thread-1046::ERROR::2013-12-19 13:19:24,958::API::1223::vds::(getStats) failed to retrieve Hosted Engine HA score
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1221, in getStats
    stats['haScore'] = haClient.HAClient().get_local_host_score()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 193, in get_local_host_score
    self._config = config.Config()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/env/config.py", line 57, in __init__
    self._load(Config.static_files)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/env/config.py", line 63, in _load
    with open(fname, 'r') as f:
IOError: [Errno 2] No such file or directory: '/etc/ovirt-hosted-engine/hosted-engine.conf'
...
Looks like setup left-overs which we should handle properly.
(In reply to Doron Fediuck from comment #2)
> Looks like setup left-overs which we should handle properly.

Seems more like a bug in vdsm getStats: it's trying to read a file that is not there because the deploy is not done yet. I think Greg has already seen it.
After looking into it - there's another error in the log that is causing setup to stall:

Thread-49::DEBUG::2013-12-19 13:25:44,912::domainMonitor::263::Storage.DomainMonitorThread::(_monitorDomain) Unable to issue the acquire host id 1 request for domain 4eea45f1-0be1-4c5c-9ec3-1460a16de055
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 259, in _monitorDomain
    self.domain.acquireHostId(self.hostId, async=True)
  File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 189, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('4eea45f1-0be1-4c5c-9ec3-1460a16de055', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))

That being said, the exception from the HAClient library is a bit unsightly but should be recorded somewhere. We might be able to clean up the logs a little by reducing this to a simple error rather than always logging the whole backtrace.
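For illustration only, the log-noise reduction suggested above could look roughly like the sketch below. It is not the actual vdsm patch: `get_ha_score`, its `read_score` argument (a stand-in for the `haClient.HAClient().get_local_host_score()` call seen in the traceback), and the fallback score of 0 are all assumptions made for the example.

```python
import logging

log = logging.getLogger("vds")


def get_ha_score(read_score, default=0):
    """Return the HA score, or `default` if it cannot be read.

    `read_score` is a hypothetical stand-in for
    haClient.HAClient().get_local_host_score(); `default` is an
    assumed fallback value, not taken from vdsm.
    """
    try:
        return read_score()
    except (IOError, OSError) as e:
        # Log a single line at ERROR level instead of a full backtrace
        # every time getStats is polled.
        log.error("failed to retrieve Hosted Engine HA score: %s", e)
        # Keep the full traceback available when debug logging is on.
        log.debug("HA score traceback", exc_info=True)
        return default
```

With this shape, a host where hosted-engine.conf does not exist yet would produce one short error line per poll, and the traceback would only appear at debug level.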
moving to 3.3.2 since 3.3.1 was built and moved to QE.
Adding a patch to reduce the noise from the haclient and reassigning to vdsm to deal with the sanlock part.
Is the change already in 3.5 and 3.4? Is this a 3.3 issue only?
(In reply to Sandro Bonazzola from comment #7)
> Is the change already in 3.5 and 3.4? Is this a 3.3 issue only?

It exists in 3.4 and 3.5 and is not fixed there. Good catch; I'm going to post a patch in a moment.
This bug's status was moved to MODIFIED before vdsm vt5 was built, hence moving to ON_QA. If this was a mistake and the fix isn't in, please contact rhev-integ.
Verified on vt5
*** Bug 1150285 has been marked as a duplicate of this bug. ***
The patch is missing in vdsm-4.16.7
Fixed in vt13
Verified on vdsm-4.16.8.1-2.el6ev.x86_64 and ovirt-hosted-engine-ha-1.2.4-2.el6ev.noarch. Redeployment succeeded.
RHEV 3.5.1 was GA'd. closing current release.