Bug 1099874
| Summary: | [ ERROR ] Failed to execute stage 'Misc configuration': The read operation timed out | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Nikolai Sednev <nsednev> |
| Component: | ovirt-hosted-engine-setup | Assignee: | Yedidyah Bar David <didi> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Leonid Natapov <lnatapov> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.4.0 | CC: | aberezin, alukiano, danken, dfediuck, didi, fsimonce, iheim, lveyde, nsednev, sbonazzo, stirabos, sylvain.deswaerte |
| Target Milestone: | --- | Keywords: | Reopened, Triaged, Unconfirmed |
| Target Release: | 3.4.4 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | integration | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-10-28 08:51:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1147410 | | |
| Attachments: | sosreport from host deploying the self hosted engine (attachment 942248) | | |
Comment 1
Nikolai Sednev
2014-05-21 12:00:13 UTC
Federico, is something changed in VDSM that can cause activateStorageDomain to time out?

Nikolai, can you attach vdsm logs?

(In reply to Sandro Bonazzola from comment #3)
> Nikolai, can you attach vdsm logs?

My bad, I can't add them as I wiped out my setup last week; I will add them when this scenario happens again.

Please reopen if you manage to reproduce, thanks.

Created attachment 942248 [details]
sosreport from host deploying the self hosted engine

Hello, I had exactly the same problem, solved by manually deleting the content of the storage domain. I attached the sosreport. Regards

didi, can you check if this is a duplicate of bug #1152564?

*** Bug 1152564 has been marked as a duplicate of this bug. ***

You know, a week has already passed, and this is my work host, so I can only keep it in a state with a broken hosted-engine --deploy for so long. Isn't a week enough to get all the logs from the host?

Works for me on these components and had not reproduced since then until today:
- mom-0.4.1-4.el6ev.noarch
- libvirt-0.10.2-46.el6_6.2.x86_64
- vdsm-4.16.8.1-3.el6ev.x86_64
- ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
- sanlock-2.8-1.el6.x86_64
- ovirt-host-deploy-1.3.0-2.el6ev.noarch
- ovirt-hosted-engine-ha-1.2.4-3.el6ev.noarch
- rhevm-3.5.0-0.25.el6ev.noarch

Looked at this again after a report on users@.
Managed to reproduce and fix. Updating here for reference.
Reproduction was, more-or-less (a consolidated transcript of these steps follows the list):
Created a VM with nested-kvm to be used as a host
Installed RHEL6
yum install ovirt-hosted-engine-setup from 3.4 repo
hosted-engine --deploy using nfs storage
- At the first prompt ("The VM has been started. Install the OS...") replied 3 (abort)
Ran again, it said the machine is running
hosted-engine --vm-poweroff
- it killed the machine
hosted-engine --set-maintenance --mode=global
- it hung, killed it after a few minutes
sanlock status
- output some info
sanlock shutdown -f 1
- it said it shutdown
sanlock status
- no output
Removed the data in the SD created before
hosted-engine --deploy
- This time it failed as in this bug
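For reference, the sequence above as one consolidated shell transcript (a sketch: the NFS export path is a placeholder, and replies to interactive prompts are noted as comments):

```
hosted-engine --deploy                          # replied 3 (abort) at the "Install the OS..." prompt
hosted-engine --deploy                          # said the machine is already running
hosted-engine --vm-poweroff                     # killed the machine
hosted-engine --set-maintenance --mode=global   # hung; killed after a few minutes
sanlock status                                  # still printed some info
sanlock shutdown -f 1                           # said it shut down
sanlock status                                  # no output anymore
rm -rf /path/to/nfs/export/*                    # placeholder path: remove the data in the SD created before
hosted-engine --deploy                          # this time it failed as in this bug
```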
At this point, running:
vdsClient -s 0 getVdsStats
- also got stuck; didn't see anything suspicious in vdsm.log
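(To confirm such a hang without blocking the shell indefinitely, one can wrap the call in coreutils' timeout; the 10-second limit here is an arbitrary choice of mine, not from the original report:)

```
timeout 10 vdsClient -s 0 getVdsStats || echo "vdsClient hung or failed"
```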
How to solve?
vdsm has code to check whether hosted-engine is set up, and if so, it connects to hosted-engine-ha (agent/broker). But at this point it is "set up" while HA is still down. Doing:
# rm /etc/ovirt-hosted-engine/hosted-engine.conf
was enough to make it not try that anymore.
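A minimal illustration of that state check (my reconstruction of the logic as shell, not the actual vdsm code):

```
# vdsm treats the presence of this file as "hosted-engine is configured"
# and then tries to reach the HA agent/broker, which is still down here.
if [ -f /etc/ovirt-hosted-engine/hosted-engine.conf ]; then
    echo "hosted-engine configured: vdsm will query the HA agent/broker"
fi
# removing the file drops that assumption, so vdsm stops trying:
rm /etc/ovirt-hosted-engine/hosted-engine.conf
```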
Leaving the bug closed for now. If we decide to reopen, the proper subject should probably be "hosted-engine has no cleanup tool". In 3.5 we added an option '4' at that prompt, which also kills the VM and thus manages to release sanlock. I then verified that running deploy again works, which seems enough.
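For anyone hitting this before such a cleanup tool exists, a rough manual cleanup assembled from the steps above would be (a sketch, not a supported procedure; the storage path is a placeholder):

```
hosted-engine --vm-poweroff                       # make sure the engine VM is down
sanlock shutdown -f 1                             # release the sanlock lockspace
rm /etc/ovirt-hosted-engine/hosted-engine.conf    # un-mark the host as hosted-engine
rm -rf /path/to/storage/domain/*                  # placeholder: wipe the half-created SD
hosted-engine --deploy                            # start over
```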